Phylogenomic branch length estimation using quartets
Bioinformatics, Journal Year: 2023, Volume and Issue: 39(Supplement_1), P. i185 - i193
Published: June 1, 2023
Branch lengths and topology of a species tree are essential in most downstream analyses, including estimation of diversification dates, characterization of selection, understanding adaptation, and comparative genomics. Modern phylogenomic analyses often use methods that account for the heterogeneity of evolutionary histories across the genome due to processes such as incomplete lineage sorting. However, these methods typically do not generate branch lengths in units usable by downstream applications, forcing phylogenomic analyses to resort to alternative shortcuts such as estimating branch lengths by concatenating gene alignments into a supermatrix. Yet, concatenation and other available approaches for estimating branch lengths fail to address heterogeneity across the genome.
Language: English
DISCO+QR: rooting species trees in the presence of GDL and ILS
Bioinformatics Advances, Journal Year: 2023, Volume and Issue: 3(1)
Published: Jan. 1, 2023
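Genes evolve under processes such as gene duplication and loss (GDL), so that gene family trees are multi-copy, as well as incomplete lineage sorting (ILS); both processes produce gene trees that differ from the species tree. The estimation of species trees from sets of gene family trees is challenging, and the estimation of rooted species trees presents additional analytical challenges. Two of the methods developed for this problem are STRIDE, which roots species trees by considering GDL events, and Quintet Rooting (QR), which roots species trees by considering ILS.

We present DISCO+QR, a new approach to rooting species trees that first uses DISCO to address GDL and then uses QR to perform rooting in the presence of ILS. DISCO+QR operates by taking the input gene family trees and decomposing them into single-copy trees using DISCO, and then roots the given species tree using the information in the single-copy gene trees using QR. We show that the relative accuracy of STRIDE and DISCO+QR depends on the properties of the dataset (number of species, number of genes, rate of gene duplication, degree of ILS, and gene tree estimation error), and that each provides advantages over the other under some conditions.

DISCO and QR are available on GitHub. Supplementary data are available at Bioinformatics Advances online.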
Language: English
On the robustness to gene tree rooting (or lack thereof) of triplet-based species tree estimation methods
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Nov. 25, 2024
Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. This process becomes particularly challenging due to gene tree heterogeneity (discordance), often resulting from Incomplete Lineage Sorting (ILS). Triplet- and quartet-based approaches for species tree estimation have gained substantial attention as they are provably statistically consistent in the presence of ILS. However, unlike quartet-based methods, the limitation of rooted triplet-based methods in handling unrooted gene trees has restricted their adoption in the systematics community. Furthermore, since the induced triplet distribution in a gene tree depends on the placement of the root, the accuracy of triplet-based methods depends on the accuracy of gene tree rooting. Despite progress in developing methods for rooting trees, the choice of rooting technique and its downstream effects on species tree inference under realistic model conditions remain greatly understudied. This study involves rigorous empirical testing with different rooting approaches to establish a nuanced understanding of the impact of rooting on species tree accuracy. Moreover, we aim to investigate the conditions under which triplet-based methods provide more accurate species tree estimations than widely-used quartet-based methods such as ASTRAL.
Language: English
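The abstract above notes that the induced triplet distribution of a gene tree depends on where the root is placed. The following is a minimal sketch of that dependence, not code from the paper: trees are written as nested Python tuples, and the helper names `leaf_set` and `rooted_triplet` are hypothetical. It roots the same unrooted four-taxon tree (split AB|CD) in two places and reports which pair of A, B, C is grouped together in each case.

```python
def leaf_set(tree):
    """Return the set of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaf_set(child)
    return result


def rooted_triplet(tree, a, b, c):
    """Return the two taxa among {a, b, c} that group together in the
    rooted tree, or None if the triplet is unresolved or a taxon is missing."""
    target = {a, b, c}
    node = tree
    while not isinstance(node, str):
        hits = [(child, leaf_set(child) & target) for child in node]
        # Descend while a single child still contains all three taxa.
        below = [child for child, hit in hits if len(hit) == 3]
        if below:
            node = below[0]
            continue
        # Otherwise, the child holding exactly two of them defines the cherry.
        for _, hit in hits:
            if len(hit) == 2:
                return frozenset(hit)
        return None
    return None


# Two rootings of the same unrooted tree with split AB|CD.
rooted_on_internal_edge = (("A", "B"), ("C", "D"))
rooted_on_edge_to_A = ("A", ("B", ("C", "D")))

print(sorted(rooted_triplet(rooted_on_internal_edge, "A", "B", "C")))  # ['A', 'B']
print(sorted(rooted_triplet(rooted_on_edge_to_A, "A", "B", "C")))      # ['B', 'C']
```

The unrooted topology is identical in both cases, yet the rooted triplet on A, B, C changes from AB|C to BC|A, which is why a misplaced gene tree root can mislead a triplet-based method.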
DISCO+QR: Rooting Species Trees in the Presence of GDL and ILS
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown
Published: Jan. 3, 2023
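Abstract
Genes evolve under processes such as gene duplication and loss (GDL), so that gene family trees are multi-copy, as well as incomplete lineage sorting (ILS); both processes produce gene trees that differ from the species tree. The estimation of species trees from sets of gene family trees is challenging, and the estimation of rooted species trees presents additional analytical challenges. Two of the methods developed for this problem are STRIDE (Emms and Kelly, MBE 2017), which roots species trees by considering GDL events, and Quintet Rooting (Tabatabaee et al., ISMB 2022 and Bioinformatics 2022), which roots species trees by considering ILS. We present DISCO+QR, a new method for rooting species trees in the presence of GDL and ILS. DISCO+QR operates by taking the input gene family trees and decomposing them into single-copy trees using DISCO (Willson et al., Systematic Biology 2022), and then rooting the given species tree using the information in the single-copy gene trees using Quintet Rooting (QR). We show that the relative accuracy of STRIDE and DISCO+QR depends on properties of the dataset (number of species, number of genes, rate of gene duplication, degree of ILS, and gene tree estimation error), and that each provides advantages over the other under some conditions.
Availability: DISCO and QR are available on GitHub. The supplementary materials are available at http://tandy.cs.illinois.edu/discoqr-suppl.pdf.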
Language: English
Statistically Consistent Rooting of Species Trees Under the Multispecies Coalescent Model
Lecture notes in computer science, Journal Year: 2023, Volume and Issue: unknown, P. 41 - 57
Published: Jan. 1, 2023
Abstract
Rooted species trees are used in several downstream applications of phylogenetics. Most species tree estimation methods produce unrooted trees, and additional methods are then used to root these trees. Recently, Quintet Rooting (QR) (Tabatabaee et al., ISMB and Bioinformatics 2022), a polynomial-time method for rooting an unrooted species tree given unrooted gene trees under the multispecies coalescent, was introduced. QR, which is based on a proof of identifiability of rooted 5-taxon trees in the presence of incomplete lineage sorting, was shown to have good accuracy, improving over other methods for rooting species trees when incomplete lineage sorting was the only cause of gene tree discordance, except when gene tree estimation error was very high. However, the statistical consistency of QR was left as an open question. Here, we present QR-STAR, a polynomial-time variant of QR that has an additional step for determining the rooted shape of each quintet tree. We prove that QR-STAR is statistically consistent under the multispecies coalescent model, and our simulation study shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open source form at https://github.com/ytabatabaee/Quintet-Rooting.
Language: English
QR-STAR: A Polynomial-Time Statistically Consistent Method for Rooting Species Trees Under the Coalescent
Journal of Computational Biology, Journal Year: 2023, Volume and Issue: 30(11), P. 1146 - 1181
Published: Oct. 30, 2023
We address the problem of rooting an unrooted species tree given a set of unrooted gene trees, under the assumption that gene trees evolve within the model species tree under the multispecies coalescent (MSC) model. Quintet Rooting (QR) is a polynomial time algorithm that was recently proposed for this problem, which is based on theory developed by Allman, Degnan, and Rhodes that proves the identifiability of rooted 5-taxon trees from unrooted gene trees under the MSC. However, although QR had good accuracy in simulations, its statistical consistency was left as an open problem. We present QR-STAR, a variant of QR with an additional step and a different cost function, and prove that it is statistically consistent under the MSC. Moreover, we derive sample complexity bounds for QR-STAR and show that a particular variant of it based on "short quintets" has polynomial sample complexity. Finally, our simulation study under a variety of model conditions shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open-source form on GitHub.
Language: English
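The QR and QR-STAR abstracts above rest on rooted 5-taxon trees (quintets). As a rough illustration of the sizes involved, the sketch below uses only standard combinatorial counts; it is not taken from the QR or QR-STAR code, and the function names are hypothetical. It counts the candidate root positions of an unrooted binary tree (one per edge), the number of rooted binary topologies on a given number of labeled leaves, and the number of 5-taxon subsets of a taxon set.

```python
from itertools import combinations
from math import comb


def candidate_root_edges(n_taxa):
    """Edges in an unrooted binary tree with n_taxa leaves (n_taxa >= 2);
    each edge is one possible root position."""
    return 2 * n_taxa - 3


def rooted_binary_topologies(n_taxa):
    """Number of rooted binary topologies on n_taxa labeled leaves,
    given by the double factorial (2n - 3)!!."""
    count = 1
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count


def quintets(taxa):
    """Yield every 5-taxon subset of the given taxa."""
    return combinations(sorted(taxa), 5)


taxa = ["A", "B", "C", "D", "E", "F", "G"]
print(candidate_root_edges(5))       # 7
print(rooted_binary_topologies(5))   # 105
print(len(list(quintets(taxa))))     # 21
```

For five taxa, the 105 rooted topologies correspond to the 15 unrooted topologies times 7 root positions each. The number of quintets grows as C(n, 5), which is one reason a method might work with a subset of quintets rather than all of them.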