Median quartet tree search algorithms using optimal subtree prune and regraft
Shayesteh Arasti, Siavash Mirarab

Algorithms for Molecular Biology, Journal Year: 2024, Volume and Issue: 19(1)

Published: March 13, 2024

Abstract: Gene trees can be different from the species tree due to biological processes and inference errors. One way to obtain a species tree is to find one that maximizes some measure of similarity to a set of gene trees. The number of shared quartets between a potential species tree and the gene trees provides a statistically justifiable score; if maximized properly, it could result in a statistically consistent estimator of the species tree under several statistical models of discordance. However, finding the median quartet score tree, one that maximizes this score, is NP-Hard, motivating existing heuristic algorithms. These heuristics do not follow the hill-climbing paradigm used extensively in phylogenetics. In this paper, we make theoretical contributions that enable an efficient hill-climbing approach. Specifically, we show that a subtree of size m can be placed optimally on a tree of size n in quasi-linear time with respect to n and (almost) independently of m. This enables us to perform subtree prune and regraft (SPR) rearrangements as part of a hill-climbing search. We show that this approach can slightly improve upon the results of widely-used methods such as ASTRAL in terms of the optimization score but not necessarily accuracy.

Language: English
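
To make the quartet score named in the abstract concrete, here is a minimal brute-force sketch. It is not the quasi-linear placement algorithm the paper describes; it enumerates every four-leaf subset, which is only practical for tiny trees. The nested-tuple tree encoding and the function names `clades`, `quartet_topologies`, and `quartet_score` are assumptions made for illustration.

```python
from itertools import combinations

def clades(tree):
    """Return (leaf set of tree, leaf sets of all internal subtrees)."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found += child_found
    found.append(leaves)
    return leaves, found

def quartet_topologies(tree):
    """Map each resolved 4-leaf subset to its split, e.g. {A,B} | {C,D}."""
    leaves, all_clades = clades(tree)
    topologies = {}
    for quartet in combinations(sorted(leaves), 4):
        q = frozenset(quartet)
        for clade in all_clades:
            side = clade & q
            if len(side) == 2:  # this edge separates two leaves from the other two
                topologies[q] = frozenset([side, q - side])
                break
    return topologies

def quartet_score(candidate, gene_trees):
    """Count quartets shared between the candidate tree and each gene tree."""
    cand = quartet_topologies(candidate)
    return sum(
        sum(1 for q, topo in quartet_topologies(g).items() if cand.get(q) == topo)
        for g in gene_trees
    )

# Hypothetical toy data: five taxa, two gene trees.
species = ((("A", "B"), "C"), ("D", "E"))
genes = [
    ((("A", "B"), "C"), ("D", "E")),  # identical to the candidate: 5 shared quartets
    ((("A", "C"), "B"), ("D", "E")),  # A and C swapped in: 3 shared quartets
]
print(quartet_score(species, genes))  # 8
```

A median quartet tree is a candidate that maximizes this total over all possible trees; a hill-climbing search would re-evaluate the score after each SPR move, which is why a fast placement step matters.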

CASTER: Direct species tree inference from whole-genome alignments
Chao Zhang, Rasmus Nielsen, Siavash Mirarab, et al.

Science, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 23, 2025

Genomes contain mosaics of discordant evolutionary histories, challenging the accurate inference of the tree of life. While genome-wide data are routinely used for discordance-aware phylogenomic analyses, due to modeling and scalability limitations, current practice leaves out large chunks of genomes. As more high-quality genomes become available, we urgently need methods to infer the tree directly from a multiple genome alignment. Here, we introduce CASTER, a theoretically justified site-based method that eliminates the need to predefine recombination-free loci. CASTER is scalable to hundreds of mammalian whole genomes. We demonstrate the accuracy of CASTER in simulations that include recombination and apply CASTER to several biological datasets, showing that its per-site scores can reveal both biological and artefactual patterns of discordance across the genome.

Language: English

Citations

2

The meaning and measure of concordance factors in phylogenomics
Robert Lanfear, Matthew W. Hahn

Published: Jan. 3, 2024

As phylogenomic datasets have grown in size, researchers have developed new ways to measure biological variation and to assess statistical support. Larger datasets have many more sites and loci, and therefore less sampling variance. While this means that we can more accurately measure the mean signal in these datasets, the lower sampling variance is often reflected in widely used measures of branch support (such as the bootstrap and posterior probability) being uniformly high, limiting their utility. Larger datasets have also revealed a large amount of biological variation in the topologies found across individual sites and loci, such that the single species tree inferred by most phylogenetic methods represents a limited summary of the data. In contrast to measures of statistical support, the degree of underlying topological variation among sites or loci should be approximately constant regardless of the size of the dataset. “Concordance factors” and similar statistics have become increasingly important tools in phylogenetics. In this review, we explain why concordance factors should be thought of as descriptors of biological variation, rather than as measures of statistical support, and argue that they provide important information not contained in measures of support. We review a growing suite of statistics derived from various methods for measuring concordance, comparing them in a common framework that reveals their interrelationships. We also discuss how measures of concordance might change in the future as we move beyond estimating a single “tree of life” towards estimating the myriad evolutionary histories underlying genomic variation.

Language: English

Citations

7

The meaning and measure of concordance factors in phylogenomics
Robert Lanfear, Matthew W. Hahn

Molecular Biology and Evolution, Journal Year: 2024, Volume and Issue: 41(11)

Published: Oct. 17, 2024

Abstract: As phylogenomic datasets have grown in size, researchers have developed new ways to measure biological variation and to assess statistical support for specific branches. Larger datasets have more sites and loci and therefore less sampling variance. While we can more accurately measure the mean signal in these datasets, lower sampling variance is often reflected in uniformly high measures of branch support (such as the bootstrap and posterior probability), limiting their utility. Larger datasets have also revealed substantial biological variation in the topologies found across individual loci, such that the single species tree inferred by most phylogenetic methods represents a limited summary of the data for many purposes. In contrast to measures of statistical support, the degree of underlying topological variation among loci should be approximately constant regardless of the size of the dataset. “Concordance factors” (CFs) and similar statistics have therefore become increasingly important tools in phylogenetics. In this review, we explain why CFs should be thought of as descriptors of topological variation rather than as measures of statistical support, and argue that they provide important information about the predictive power of the species tree not contained in measures of support. We review a growing suite of statistics for measuring concordance, compare them in a common framework that reveals their interrelationships, and demonstrate how to calculate them using an example from birds. We also discuss how measures of concordance might change in the future as we move beyond estimating a single “tree of life” toward estimating the myriad evolutionary histories underlying genomic variation.

Language: English

Citations

4
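
As a rough illustration of what a concordance factor measures, the sketch below computes a simplified gene concordance factor: for each internal branch of a reference tree, the fraction of gene trees that contain the same split. It assumes every gene tree covers the same taxa and is not the procedure from the review or from any particular software package; the nested-tuple encoding and the names `splits` and `gene_concordance_factors` are hypothetical.

```python
def splits(tree):
    """Return the set of non-trivial splits (bipartitions) of a nested-tuple tree."""
    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node]), []
        leaves, found = frozenset(), []
        for child in node:
            child_leaves, child_found = walk(child)
            leaves |= child_leaves
            found += child_found
        found.append(leaves)
        return leaves, found

    leaves, found = walk(tree)
    # Keep only clades that define an internal branch (at least two leaves per side).
    return {
        frozenset([clade, leaves - clade])
        for clade in found
        if 2 <= len(clade) <= len(leaves) - 2
    }

def gene_concordance_factors(reference, gene_trees):
    """Map each reference split to the fraction of gene trees that contain it."""
    per_gene = [splits(g) for g in gene_trees]
    return {
        split: sum(split in gene for gene in per_gene) / len(gene_trees)
        for split in splits(reference)
    }

# Hypothetical toy data: one reference tree and four gene trees on five taxa.
reference = ((("A", "B"), "C"), ("D", "E"))
gene_trees = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "D"), "B"), ("C", "E")),
]
for split, cf in gene_concordance_factors(reference, gene_trees).items():
    print(sorted(sorted(side) for side in split), cf)
```

Adding more gene trees drawn from the same process narrows the sampling error of these fractions but does not push them toward 1, which matches the abstract's point that such descriptors stay roughly constant as datasets grow.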

WASTER: Practical de novo phylogenomics from low-coverage short reads
Chao Zhang, Rasmus Nielsen

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 24, 2025

The advent of affordable whole-genome sequencing has spurred numerous large-scale projects aimed at inferring the tree of life, yet achieving a complete species-level phylogeny remains a distant goal due to significant costs and computational demands. Traditional species tree inference methods, though effective, are hampered by the need for high-coverage sequencing, high-quality genomic alignments, and extensive computational resources. To address these challenges, this study introduces WASTER, a novel de novo tool for inferring species trees directly from short-read sequences. WASTER employs a k-mer based approach for identifying variable sites, circumventing the need for genome assembly and alignment. Using simulations, we demonstrate that WASTER achieves accuracy comparable to that of traditional alignment-based methods, even for low sequencing depth, and substantially higher accuracy than other alignment-free methods. We validate WASTER's efficacy on real data, where it accurately reconstructs phylogenies of eukaryotic species with depth as low as 1.5X. WASTER provides a fast and efficient solution for phylogeny estimation in cases where genome assembly and/or alignment may bias analyses or is challenging, for example due to low sequencing depth. It also provides a method for generating guide trees for tree-based alignment algorithms. WASTER's ability to accurately estimate trees from low-coverage sequencing data without relying on assembly and alignment will lead to substantially reduced sequencing and/or computational costs in phylogenomic projects.

Language: English

Citations

0

Phylogenomic and morphological evidence supports the reinstatement of the bamboo genus Clavinodum from Oligostachyum (Poaceae: Bambusoideae)
Zhengyang Niu, Zhixian Zhang, Zhuoyu Cai, et al.

Molecular Phylogenetics and Evolution, Journal Year: 2025, Volume and Issue: unknown, P. 108327

Published: March 1, 2025

Language: English

Citations

0

Phylogenomic analyses of all species of swordtail fishes (genus Xiphophorus) show that hybridization preceded speciation
Kang Du, Juliana M.B. Ricci, Yuan Lu, et al.

Nature Communications, Journal Year: 2024, Volume and Issue: 15(1)

Published: Aug. 4, 2024

Abstract: Hybridization has been recognized to play important roles in evolution; however, studies of the genetic consequence of hybridization are still lagging behind in vertebrates due to the lack of appropriate experimental systems. Fish of the genus Xiphophorus are proposed to have evolved with multiple ancient and ongoing hybridization events. They have served as an informative research model in evolutionary biology and in biomedical research on human disease for more than a century. Here, we provide the complete genomic resource including annotations for all described 26 Xiphophorus species and three undescribed taxa and resolve all uncertain phylogenetic relationships. We investigate the molecular evolution of genes related to cancers such as melanoma and of genes for the genetic control of puberty timing, focusing on genes that are predicted to be involved in pre- and postzygotic isolation and thus affect hybridization. We discovered dramatic size-variation of some gene families. These persisted despite reticulate evolution, rapid speciation and short divergence time. Finally, we clarify the hybridization history in the entire genus, settling the disputed hybridization history of two Southern swordtails. Our comparative genomic analyses revealed hybridization ancestries that are manifested in the mosaic fused genomes and show that hybridization often preceded speciation.

Language: English

Citations

2

Contact zones reveal restricted introgression despite frequent hybridization across a recent lizard radiation
Stephen M. Zozaya, Scott A. Macor, Rhiannon Schembri, et al.

Evolution, Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 6, 2024

Introgression, the exchange of genetic material through hybridization, is now recognized as common among animal species. The extent of introgression, however, can vary considerably even when it occurs: for example, introgression can be geographically restricted or so pervasive that populations merge. Such variation highlights the importance of understanding the factors mediating introgression. Here we used genome-wide SNP data to assess hybridization and introgression at 32 contact zones, comprising 21 phylogenetic independent contrasts across a recent lizard radiation (Heteronotia). We then tested the relationship between the extent of introgression (average admixture at contact zones) and genomic divergence across independent contrasts. Early generation hybrids were detected at contact zones spanning the range of genomic divergence included here. Despite this, we found that introgression is remarkably rare and, when observed, geographically restricted. Only the two most genomically similar population pairs showed introgression beyond 5 km of the contact zone. Introgression dropped precipitously at only modest levels of genomic divergence, beyond which it was absent or extremely low. Our results contrast with the growing number of studies indicating that introgression is prevalent among animals, suggesting that animal groups will vary considerably in their propensity for introgression.

Language: English

Citations

1
