Species tree branch length estimation despite incomplete lineage sorting, duplication, and loss
bioRxiv (Cold Spring Harbor Laboratory)
Journal Year: 2025
Volume and Issue: unknown
Published: Feb. 21, 2025
Abstract
Phylogenetic branch lengths are essential for many analyses, such as estimating divergence times, analyzing rate changes, and studying adaptation. However, true gene tree heterogeneity due to incomplete lineage sorting (ILS), gene duplication and loss (GDL), and horizontal gene transfer (HGT) can complicate the estimation of species tree branch lengths. While several tools exist for estimating the topology of a species tree while addressing various causes of gene tree discordance, much less attention has been paid to branch length estimation on multi-locus datasets. For single-copy gene trees, some methods are available that summarize gene tree branch lengths onto a species tree, including coalescent-based methods that account for heterogeneity due to ILS. However, no such branch length estimation method exists for multi-copy gene family trees that have evolved with gene duplication and loss. To address this gap, we introduce the CASTLES-Pro algorithm for estimating species tree branch lengths while accounting for both GDL and ILS. CASTLES-Pro improves on the existing coalescent-based branch length estimation method CASTLES by increasing its accuracy for single-copy gene trees and extends it to handle multi-copy ones. Our simulation studies show that CASTLES-Pro is generally more accurate than alternatives, eliminating the systematic bias toward overestimating terminal branch lengths often observed when using concatenation. Moreover, while not theoretically designed for HGT, we show that CASTLES-Pro maintains relatively high accuracy under high rates of random HGT.
Code availability: CASTLES-Pro is implemented inside the software package ASTER, available at https://github.com/chaoszhang/ASTER.
Data availability: The datasets and scripts used in this study are available at https://github.com/ytabatabaee/CASTLES-Pro-paper.
Language: English
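The abstract above reports that concatenation tends to overestimate terminal branch lengths but does not say why. One standard property of the coalescent that points in this direction is that two lineages sampled from different species cannot coalesce until they reach the ancestral population, so a gene tree divergence predates the species split. The sketch below only illustrates that property; it is not part of CASTLES-Pro or ASTER. The function name and parameters are hypothetical, times are in coalescent units, and mutation rates are ignored.

```python
import random


def mean_pairwise_coalescence_time(split_time, n_loci=100_000, seed=0):
    """Average coalescence time of two lineages, one from each of two species.

    Assumption: a two-species model with a single ancestral population.
    Looking backward in time, the lineages first meet at `split_time`; the
    additional waiting time until they coalesce is Exponential(rate=1) in
    coalescent units.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_loci):
        total += split_time + rng.expovariate(1.0)
    return total / n_loci


if __name__ == "__main__":
    t = 0.5  # species divergence time (hypothetical value)
    # The average gene divergence time is close to t + 1, not t.
    print(mean_pairwise_coalescence_time(t))
```

Under these assumptions the expected gene divergence time exceeds the species divergence time by one coalescent unit, which is the kind of gap a coalescent-aware branch length method has to correct for.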
Statistically Consistent Rooting of Species Trees Under the Multispecies Coalescent Model
Lecture Notes in Computer Science
Journal Year: 2023
Volume and Issue: unknown, P. 41 - 57
Published: Jan. 1, 2023
Abstract
Rooted species trees are used in several downstream applications of phylogenetics. Most species tree estimation methods produce unrooted trees, and additional methods are then used to root these unrooted trees. Recently, Quintet Rooting (QR) (Tabatabaee et al., ISMB and Bioinformatics 2022), a polynomial-time method for rooting an unrooted species tree given unrooted gene trees under the multispecies coalescent, was introduced. QR, which is based on a proof of identifiability of rooted 5-taxon trees in the presence of incomplete lineage sorting, was shown to have good accuracy, improving over other methods for rooting species trees when incomplete lineage sorting was the only cause of gene tree discordance, except when gene tree estimation error was very high. However, the statistical consistency of QR was left as an open question. Here, we present QR-STAR, a polynomial-time variant of QR that has an additional step for determining the rooted shape of each quintet tree. We prove that QR-STAR is statistically consistent under the multispecies coalescent model, and our simulation study shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open source form at https://github.com/ytabatabaee/Quintet-Rooting.
Language: English
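The abstract above builds on rooted 5-taxon (quintet) trees. As a small counting sketch, the code below enumerates the 5-taxon subsets of a taxon set and counts binary tree topologies on five leaves. It is not the QR or QR-STAR implementation, and it makes no claim about which quintets those methods actually score; the helper names and the taxon labels are hypothetical.

```python
from itertools import combinations
from math import comb


def quintets(taxa):
    """Yield every 5-taxon subset of `taxa` as a sorted tuple."""
    return combinations(sorted(taxa), 5)


def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1 or 2 (1 for k <= 0)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_unrooted_topologies(n):
    """Number of unrooted binary tree topologies on n labeled leaves (n >= 3)."""
    return double_factorial(2 * n - 5)


def num_rooted_topologies(n):
    """Number of rooted binary tree topologies on n labeled leaves (n >= 2)."""
    return double_factorial(2 * n - 3)


if __name__ == "__main__":
    taxa = ["A", "B", "C", "D", "E", "F", "G"]  # hypothetical taxon labels
    subsets = list(quintets(taxa))
    # The number of quintets is C(n, 5), which grows polynomially in n.
    print(len(subsets), comb(len(taxa), 5))
    # Each unrooted 5-leaf tree has 7 edges, so it admits 7 possible rootings.
    print(num_unrooted_topologies(5), num_rooted_topologies(5))
```

The number of quintets grows as a degree-5 polynomial in the number of taxa, and each unrooted quintet tree has only seven candidate root positions, so the search space per quintet is small and fixed.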
QR-STAR: A Polynomial-Time Statistically Consistent Method for Rooting Species Trees Under the Coalescent
Journal of Computational Biology
Journal Year: 2023
Volume and Issue: 30(11), P. 1146 - 1181
Published: Oct. 30, 2023
We address the problem of rooting an unrooted species tree given a set of unrooted gene trees, under the assumption that gene trees evolve within the model species tree under the multispecies coalescent (MSC) model. Quintet Rooting (QR) is a polynomial time algorithm that was recently proposed for this problem, which is based on the theory developed by Allman, Degnan, and Rhodes that proves the identifiability of rooted 5-taxon trees from unrooted gene trees under the MSC. However, although QR had good accuracy in simulations, its statistical consistency was left as an open problem. We present QR-STAR, a variant of QR with an additional step and a different cost function, and prove that it is statistically consistent under the MSC. Moreover, we derive sample complexity bounds for QR-STAR and show that a particular variant of it based on "short quintets" has polynomial sample complexity. Finally, our simulation study under a variety of model conditions shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open-source form on GitHub.
Language: English