QR-STAR: A Polynomial-Time Statistically Consistent Method for Rooting Species Trees Under the Coalescent
Yasamin Tabatabaee, Sébastien Roch, Tandy Warnow

et al.

Journal of Computational Biology, Journal Year: 2023, Volume and Issue: 30(11), P. 1146 - 1181

Published: Oct. 30, 2023

We address the problem of rooting an unrooted species tree given a set of unrooted gene trees, under the assumption that the gene trees evolve within the model species tree under the multispecies coalescent (MSC) model. Quintet Rooting (QR) is a polynomial-time algorithm that was recently proposed for this problem, which is based on theory developed by Allman, Degnan, and Rhodes that proves the identifiability of rooted 5-taxon trees from unrooted gene trees under the MSC. However, although QR had good accuracy in simulations, its statistical consistency was left as an open problem. We present QR-STAR, a variant of QR with an additional step and a different cost function, and prove that it is statistically consistent under the MSC. Moreover, we derive sample complexity bounds for QR-STAR and show that a particular variant of it based on "short quintets" has polynomial sample complexity. Finally, our simulation study under a variety of model conditions shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open-source form on GitHub.

Language: English

Species tree branch length estimation despite incomplete lineage sorting, duplication, and loss (Open Access)
Yasamin Tabatabaee, Chao Zhang, Shayesteh Arasti

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 21, 2025

Abstract: Phylogenetic branch lengths are essential for many analyses, such as estimating divergence times, analyzing rate changes, and studying adaptation. However, true gene tree heterogeneity due to incomplete lineage sorting (ILS), gene duplication and loss (GDL), and horizontal gene transfer (HGT) can complicate the estimation of species tree branch lengths. While several tools exist for estimating the topology of a species tree while addressing various causes of gene tree discordance, much less attention has been paid to branch length estimation on multi-locus datasets. For single-copy gene trees, some methods are available that summarize gene tree branch lengths onto a species tree, including coalescent-based methods that account for ILS. However, no such method exists for multi-copy gene family trees that have evolved with gene duplication and loss. To address this gap, we introduce the CASTLES-Pro algorithm for estimating species tree branch lengths while accounting for both GDL and ILS. CASTLES-Pro improves on the existing coalescent-based branch length estimation method CASTLES by increasing its accuracy for single-copy gene trees and extends it to handle multi-copy ones. Our simulation studies show that CASTLES-Pro is generally more accurate than alternatives, eliminating the systematic bias toward overestimating terminal branch lengths often observed when using concatenation. Moreover, while not theoretically designed for HGT, CASTLES-Pro maintains relatively high accuracy under high rates of random HGT. Code availability: CASTLES-Pro is implemented inside the software package ASTER, available at https://github.com/chaoszhang/ASTER . Data availability: The datasets and scripts used in this study are available at https://github.com/ytabatabaee/CASTLES-Pro-paper

Language: English

Citations

3

Statistically Consistent Rooting of Species Trees Under the Multispecies Coalescent Model (Creative Commons)
Yasamin Tabatabaee, Sébastien Roch, Tandy Warnow

et al.

Lecture notes in computer science, Journal Year: 2023, Volume and Issue: unknown, P. 41 - 57

Published: Jan. 1, 2023

Abstract: Rooted species trees are used in several downstream applications of phylogenetics. Most species tree estimation methods produce unrooted trees, and additional methods are then used to root these unrooted trees. Recently, Quintet Rooting (QR) (Tabatabaee et al., ISMB and Bioinformatics 2022), a polynomial-time method for rooting an unrooted species tree given unrooted gene trees under the multispecies coalescent, was introduced. QR, which is based on a proof of identifiability of rooted 5-taxon trees in the presence of incomplete lineage sorting, was shown to have good accuracy, improving over other methods for rooting species trees when incomplete lineage sorting was the only cause of gene tree discordance, except when gene tree estimation error was very high. However, the statistical consistency of QR was left as an open question. Here, we present QR-STAR, a variant of QR that has an additional step for determining the rooted shape of each quintet tree. We prove that QR-STAR is statistically consistent under the multispecies coalescent model, and our simulation study shows that QR-STAR matches or improves on the accuracy of QR. QR-STAR is available in open source form at https://github.com/ytabatabaee/Quintet-Rooting .

Language: English

Citations

1
