bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2022,
Volume and Issue:
unknown
Published: Oct. 27, 2022
Abstract
Rooted species trees are used in several downstream applications of phylogenetics. Most species tree estimation methods produce unrooted trees and additional methods are then used to root these unrooted trees. Recently, Quintet Rooting (QR) (Tabatabaee et al., ISMB and Bioinformatics 2022), a polynomial-time method for rooting an unrooted species tree given unrooted gene trees under the multispecies coalescent, was introduced. QR, which is based on a proof of identifiability of rooted 5-taxon trees in the presence of incomplete lineage sorting, was shown to have good accuracy, improving over other methods for rooting species trees when incomplete lineage sorting was the only cause of gene tree discordance, except when gene tree estimation error was very high. However, the statistical consistency of QR was left as an open question. Here, we present QR-STAR, a polynomial-time variant of QR that has an additional step for determining the rooted shape of each quintet tree. We prove that QR-STAR is statistically consistent under the multispecies coalescent model. Our simulation study under a variety of model conditions shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open source form at https://github.com/ytabatabaee/Quintet-Rooting.
Science,
Journal Year:
2023,
Volume and Issue:
381(6665)
Published: Sept. 28, 2023
Although some lineages of animals and plants have made impressive adaptive radiations when provided with ecological opportunity, the propensities to radiate vary profoundly among lineages for unknown reasons. In Africa's Lake Victoria region, one cichlid lineage radiated in every lake, with the largest radiation taking place in a lake less than 16,000 years old. We show that all of its ecological guilds evolved in situ. Cycles of lineage fusion through admixture and lineage fission through speciation characterize the history of the radiation. It was jump-started when several swamp-dwelling refugial populations, each of which were of older hybrid descent, met in the newly forming lake, where they fused into a single population, resuspending old admixture variation. Each population contributed a different set of ancient alleles from which a new adaptive radiation assembled in record time, involving additional fusion-fission cycles. We argue that repeated fusion-fission cycles in the history of a lineage make adaptive radiation fast and predictable.
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2025,
Volume and Issue:
unknown
Published: April 10, 2025
Abstract
Species tree estimation from genome-wide data has transformed evolutionary studies, particularly in the presence of gene tree discordance. Gene trees often differ from species trees due to factors like incomplete lineage sorting (ILS) and gene duplication and loss (GDL). Quartet-based methods have gained substantial popularity for their accuracy and statistical guarantee. However, most of these methods (e.g., ASTRAL, wQFM, wQMC) rely on single-copy gene trees and models of ILS, not GDL, limiting their applicability to large genomic datasets. ASTRAL-Pro, a recent advancement, refined quartet similarity measures to incorporate both orthology and paralogy, improving species tree inference under GDL. Among other quartet-based methods, wQFM-DISCO converts multicopy gene family trees into single-copy trees using DISCO and applies the wQFM algorithm to the single-copy trees. However, ASTRAL-Pro has remained the only quartet-based summary method to explicitly model gene duplication and loss.
In this study, we extend [...] (which requires decomposition) and wQFM-TREE (which operates directly on gene trees) by modeling gene duplication and loss, leveraging the concept of speciation-driven quartets introduced in ASTRAL-Pro. Our [...] consistently outperforms [...] across conditions, offering a promising alternative.
Journal of Computational Biology,
Journal Year:
2022,
Volume and Issue:
29(11), P. 1156 - 1172
Published: Sept. 1, 2022
Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, for a combination of reasons (ranging from sampling biases to more biological causes, as in gene birth and loss), gene trees are often incomplete, meaning that not all species of interest have a common set of genes. Incomplete gene trees can potentially impact the accuracy of phylogenomic inference. We, for the first time, introduce the problem of imputing the quartet distribution induced by a set of incomplete gene trees, which involves adding the missing quartets back to the quartet distribution. We present Quartet based Gene tree Imputation using Deep Learning (QT-GILD), an automated and specially tailored unsupervised deep learning technique, accompanied by cues from natural language processing, which learns the quartet distribution in a given set of incomplete gene trees and generates a complete set of quartets accordingly. QT-GILD is a general-purpose technique needing no explicit modeling of the subject system or reasons for missing data or gene tree heterogeneity. Experimental studies on a collection of simulated and empirical datasets suggest that QT-GILD can effectively impute the quartet distribution, which results in a dramatic improvement in species tree accuracy. Remarkably, QT-GILD not only imputes the missing quartets but can also account for gene tree estimation error. Therefore, QT-GILD advances the state-of-the-art in species tree estimation from gene trees in the face of missing data.
Algorithms for Molecular Biology,
Journal Year:
2023,
Volume and Issue:
18(1)
Published: July 19, 2023
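Species tree estimation is a basic step in many biological research projects, but is complicated by the fact that gene trees can differ from species trees due to processes such as incomplete lineage sorting (ILS), gene duplication and loss (GDL), and horizontal gene transfer (HGT), which can cause different regions within the genome to have different evolutionary histories (i.e., "gene tree heterogeneity"). One approach to estimating species trees in the presence of gene tree heterogeneity resulting from ILS operates by computing trees on each genomic region (i.e., computing "gene trees") and then using these gene trees to define a matrix of average internode distances, where the internode distance in a tree T between two species x and y is the number of nodes in T between the leaves corresponding to x and y. Given such a matrix, a tree can then be computed using methods such as neighbor joining. Methods such as ASTRID and NJst (which use this basic approach) are provably statistically consistent, very fast (low degree polynomial time) and have had high accuracy under many conditions that makes them competitive with other popular species tree estimation methods. In this study, inspired by the very recent work of weighted ASTRAL, we present weighted ASTRID, a variant of ASTRID that takes the branch uncertainty on the gene trees into account in the internode distance.
Our experimental study evaluating weighted ASTRID typically shows improvements in accuracy compared to the original (unweighted) ASTRID, and shows competitive accuracy against weighted ASTRAL, the state of the art. Our re-implementation of ASTRID also improves the runtime, with marked improvements on large datasets.
Weighted ASTRID is a new and very fast method for species tree estimation that typically improves upon ASTRID and has comparable accuracy to weighted ASTRAL, while remaining much faster. Weighted ASTRID is available at https://github.com/RuneBlaze/internode.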
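The internode distance defined in this abstract can be illustrated with a minimal Python sketch. It is not the ASTRID or weighted ASTRID implementation: the adjacency-dictionary tree representation, the helper names (`tree_from_edges`, `internode_distances`, `average_internode_matrix`) and the two toy gene trees are assumptions made for illustration, and the branch-uncertainty weighting of weighted ASTRID is not modeled.

```python
from collections import deque


def tree_from_edges(edge_list):
    """Build an undirected adjacency dict from (node, node) pairs (hypothetical helper)."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def internode_distances(adj, leaves):
    """Return {(x, y): number of nodes strictly between leaves x and y}."""
    dist = {}
    for x in leaves:
        # Breadth-first search from leaf x counts edges to every other node.
        edges = {x: 0}
        queue = deque([x])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in edges:
                    edges[v] = edges[u] + 1
                    queue.append(v)
        for y in leaves:
            if x < y:
                # A path with e edges has e - 1 nodes between its endpoints.
                dist[(x, y)] = edges[y] - 1
    return dist


def average_internode_matrix(trees):
    """Average each pair's distance over the gene trees that contain both taxa."""
    totals, counts = {}, {}
    for adj in trees:
        leaves = sorted(n for n, nbrs in adj.items() if len(nbrs) == 1)
        for pair, d in internode_distances(adj, leaves).items():
            totals[pair] = totals.get(pair, 0) + d
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


# Two hypothetical unrooted gene trees on taxa a, b, c, d ("u", "v" are internal nodes).
t1 = tree_from_edges([("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")])  # ab|cd
t2 = tree_from_edges([("a", "u"), ("c", "u"), ("u", "v"), ("b", "v"), ("d", "v")])  # ac|bd
avg = average_internode_matrix([t1, t2])
```

In this toy input, the pair (a, d) is separated by two internal nodes in both trees, while (a, b) is separated by one node in the first tree and two in the second. A distance-based method such as neighbor joining would then be run on `avg`; that step is left out of the sketch.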
Bioinformatics,
Journal Year:
2023,
Volume and Issue:
39(6)
Published: June 1, 2023
Abstract
Motivation
With the recent breakthroughs in sequencing technology, phylogeny estimation at a larger scale has become a huge opportunity. For accurate estimation of large-scale phylogeny, substantial endeavor is being devoted to introducing new algorithms or upgrading current approaches. In this work, we endeavor to improve the Quartet Fiduccia and Mattheyses (QFM) algorithm to resolve phylogenetic trees of better quality with better running time. QFM was already appreciated by researchers for its good tree quality, but fell short in larger phylogenomic studies due to its excessively slow running time.
Results
We have re-designed QFM so that it can amalgamate millions of quartets over thousands of taxa into a species tree with a great level of accuracy within a short amount of time. Named “QFM Fast and Improved (QFM-FI)”, our version is 20 000× faster than the previous version and 400× faster than the widely used variant of QFM implemented in PAUP* on larger datasets. We have also provided a theoretical analysis of the running time and memory requirements of QFM-FI. We have conducted a comparative study of QFM-FI with other state-of-the-art phylogeny reconstruction methods, such as QFM, QMC, wQMC, wQFM, and ASTRAL, on simulated as well as real biological datasets. Our results show that QFM-FI improves the running time and tree quality of QFM and produces trees that are comparable with state-of-the-art methods.
Availability and implementation
QFM-FI is open source and available at https://github.com/sharmin-mim/qfm_java.
Bioinformatics Advances,
Journal Year:
2024,
Volume and Issue:
4(1)
Published: Jan. 1, 2024
Abstract
Motivation
Gene trees often differ from the species trees that contain them due to various factors, including incomplete lineage sorting (ILS) and gene duplication and loss (GDL). Several highly accurate species tree estimation methods have been introduced to explicitly address ILS, including ASTRAL, a widely used statistically consistent method, and wQFM, a quartet amalgamation approach experimentally shown to be more accurate than ASTRAL. Two recent advancements, ASTRAL-Pro and DISCO, have emerged in phylogenomics to consider GDL. ASTRAL-Pro introduces a refined quartet similarity measure, accounting for orthology and paralogy. On the other hand, DISCO offers a general strategy to decompose multi-copy gene trees into a collection of single-copy trees, allowing the utilization of methods previously designed for species tree inference in the context of single-copy gene trees.
Results
In this study, we first introduce some variants of DISCO to examine its underlying hypotheses and present analytical results on the statistical guarantees of DISCO. In particular, DISCO-R, a variant with improved pruning, provides robust results. We then demonstrate with extensive evaluation studies on a collection of simulated and real data sets that wQFM paired with DISCO consistently matches or outperforms competing methods.
Availability and implementation
DISCO-R and other variants are freely available at https://github.com/skhakim/DISCO-variants.
Bioinformatics,
Journal Year:
2022,
Volume and Issue:
38(Supplement_1), P. i109 - i117
Published: April 14, 2022
Rooted species trees are a basic model with multiple applications throughout biology, including understanding adaptation, biodiversity, phylogeography and co-evolution. Because most species tree estimation methods produce unrooted trees, methods for rooting these trees have been developed. However, most rooting methods either rely on prior biological knowledge or assume that evolution is close to clock-like, which is not usually the case. Furthermore, most prior rooting methods do not account for biological processes that create discordance between gene trees and species trees.
We present Quintet Rooting (QR), a method for rooting species trees based on a proof of identifiability of the rooted species tree under the multi-species coalescent model established by Allman, Degnan and Rhodes (J. Math. Biol., 2011). We show that QR is generally more accurate than other rooting methods, except under extreme levels of gene tree estimation error.
Quintet Rooting is available in open source form at https://github.com/ytabatabaee/Quintet-Rooting. The simulated datasets used in this study are from a prior study and are available at https://www.ideals.illinois.edu/handle/2142/55319. The biological dataset used in this study is also from a prior study and is available at http://gigadb.org/dataset/101041.
Supplementary data are available at Bioinformatics online.
Genome Research,
Journal Year:
2023,
Volume and Issue:
unknown
Published: May 17, 2023
Summary methods are widely used to estimate species trees from genome-scale data. However, they can fail to produce accurate species trees when the input gene trees are highly discordant because of estimation error and biological processes, such as incomplete lineage sorting. Here, we introduce TREE-QMC, a new summary method that offers accuracy and scalability under these challenging scenarios. TREE-QMC builds upon weighted Quartet Max Cut, which takes weighted quartets as input and then constructs a species tree in a divide-and-conquer fashion, at each step forming a graph and seeking its max cut. The wQMC method has been successfully leveraged in the context of species tree estimation by weighting quartets by their frequencies in the gene trees; we improve upon this approach in two ways. First, we address accuracy by normalizing the quartet weights to account for “artificial taxa” introduced during the divide phase so subproblem solutions can be combined during the conquer phase. Second, we address scalability by introducing an algorithm to construct the graph directly from the gene trees; this gives TREE-QMC a time complexity of $O(n^3 k)$, where n is the number of species and k is the number of gene trees, assuming the subproblem decomposition is perfectly balanced. These contributions enable TREE-QMC to be competitive in terms of accuracy and empirical runtime with the leading quartet-based methods, even outperforming them on some model conditions explored in our simulation study. We also present the application of these methods to an avian phylogenomics data set.
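The idea of weighting quartets by their frequencies in the gene trees can be sketched naively in Python. This is not TREE-QMC's algorithm, which builds its graph directly from the gene trees to avoid this kind of enumeration; the brute-force loop below visits every four-taxon subset of every tree. The tree representation, the function names and the toy trees are all assumptions made for illustration.

```python
from collections import Counter, deque
from itertools import combinations


def tree_from_edges(edge_list):
    """Build an undirected adjacency dict from (node, node) pairs (hypothetical helper)."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def leaf_path_lengths(adj):
    """Return the sorted leaves and the edge count between every ordered pair of leaves."""
    leaves = sorted(n for n, nbrs in adj.items() if len(nbrs) == 1)
    lengths = {}
    for x in leaves:
        seen = {x: 0}
        queue = deque([x])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for y in leaves:
            lengths[(x, y)] = seen[y]
    return leaves, lengths


def quartet_topology(d, a, b, c, e):
    """Four-point condition: the pairing with the strictly smallest sum is the split."""
    sums = {
        ((a, b), (c, e)): d[(a, b)] + d[(c, e)],
        ((a, c), (b, e)): d[(a, c)] + d[(b, e)],
        ((a, e), (b, c)): d[(a, e)] + d[(b, c)],
    }
    best = min(sums.values())
    winners = [q for q, s in sums.items() if s == best]
    return winners[0] if len(winners) == 1 else None  # None means unresolved


def quartet_weights(gene_trees):
    """Count, for each resolved quartet topology, the gene trees that display it."""
    weights = Counter()
    for adj in gene_trees:
        leaves, d = leaf_path_lengths(adj)
        for a, b, c, e in combinations(leaves, 4):
            topo = quartet_topology(d, a, b, c, e)
            if topo is not None:
                weights[topo] += 1
    return weights


# Hypothetical input: two gene trees displaying ab|cd and one displaying ac|bd.
t1 = tree_from_edges([("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")])
t2 = tree_from_edges([("a", "u"), ("c", "u"), ("u", "v"), ("b", "v"), ("d", "v")])
weights = quartet_weights([t1, t1, t2])
```

On this toy input the topology ab|cd receives weight 2 and ac|bd receives weight 1. The enumeration costs on the order of n^4 quartets per gene tree, which is why a method that bypasses explicit quartet listing, as the abstract describes, matters at scale.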
Bulletin of Mathematical Biology,
Journal Year:
2023,
Volume and Issue:
85(7)
Published: June 13, 2023
Abstract
Homogeneity across lineages is a general assumption in phylogenetics according to which nucleotide substitution rates are common to all lineages. Many phylogenetic methods relax this hypothesis but keep a simple enough model to make the process of sequence evolution more tractable. On the other hand, dealing successfully with the general case (heterogeneity of rates across lineages) is one of the key features of phylogenetic reconstruction methods based on algebraic tools. The goal of this paper is twofold. First, we present a new weighting system for quartets based on algebraic and semi-algebraic tools, thus especially indicated to deal with data evolving under heterogeneous rates. This method combines the weights of two previous methods by means of a test based on the positivity of the branch lengths estimated with the paralinear distance. The method is statistically consistent when applied to data generated under the general Markov model, considers rate and base composition heterogeneity among lineages and does not assume stationarity nor time-reversibility. Second, we compare the performance of several quartet-based methods for phylogenetic tree reconstruction (namely QFM, wQFM, quartet puzzling, weight optimization and Willson’s method) in combination with several systems of weights, including the new weights and other weights based on algebraic and semi-algebraic methods or on the paralinear distance. These tests are applied to both simulated and real data and support weight optimization with the new weights as a reliable and successful reconstruction method that improves upon the accuracy of global methods (such as neighbor-joining or maximum likelihood) in the presence of long branches or on mixtures of distributions on trees.