Accurate, scalable, and fully automated inference of species trees from raw genome assemblies using ROADIES
Anshu Gupta, Siavash Mirarab, Yatish Turakhia, et al.

Proceedings of the National Academy of Sciences, Journal Year: 2025, Volume and Issue: 122(19)

Published: May 2, 2025

Current genome sequencing initiatives across a wide range of life forms offer significant potential to enhance our understanding of evolutionary relationships and support transformative biological and medical applications. Species trees play a central role in many of these applications; however, despite the widespread availability of genome assemblies, accurate inference of species trees remains challenging due to the limited automation, substantial domain expertise, and computational resources required by conventional methods. To address this limitation, we present ROADIES, a fully automated pipeline to infer species trees starting from raw genome assemblies. In contrast to the prominent approach, ROADIES incorporates a unique strategy of randomly sampling segments of the input genomes to generate gene trees. This eliminates the need for predefining a set of loci, limiting the analyses to a fixed number of genes, and performing the cumbersome genome annotation and/or whole-genome alignment steps. ROADIES also eliminates the need to infer orthology by leveraging existing discordance-aware methods that allow multicopy genes. Using the genomic datasets from large-scale sequencing efforts across four diverse life forms (placental mammals, pomace flies, birds, and budding yeasts), we show that ROADIES infers species trees that are comparable in quality to the state-of-the-art studies but in a fraction of the time and effort, including on challenging datasets with rampant gene tree discordance and complex polyploidy. With its speed, accuracy, and automation, ROADIES has the potential to vastly simplify species tree inference, making it accessible to a broader range of scientists.
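
The random-sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the ROADIES implementation: the abstract does not state segment lengths, counts, or any filtering, so the function name `sample_segments`, its parameters, and the choice to make every valid start position equally likely are all assumptions made for illustration.

```python
import random


def sample_segments(genomes, num_segments, segment_length, seed=0):
    """Draw random fixed-length segments from a set of genomes.

    Hypothetical helper, not part of ROADIES.
    genomes: dict mapping a species name to a list of contig strings.
    Returns a list of (species, contig_index, start, sequence) tuples.
    """
    rng = random.Random(seed)
    # Only contigs long enough to hold a full segment are eligible.
    eligible = [
        (species, idx, contig)
        for species, contigs in genomes.items()
        for idx, contig in enumerate(contigs)
        if len(contig) >= segment_length
    ]
    if not eligible:
        raise ValueError("no contig is long enough for the requested segment length")
    # Assumption: weight each contig by its number of valid start positions,
    # so that every start position across the input is equally likely.
    weights = [len(contig) - segment_length + 1 for _, _, contig in eligible]
    segments = []
    for _ in range(num_segments):
        species, idx, contig = rng.choices(eligible, weights=weights, k=1)[0]
        start = rng.randrange(len(contig) - segment_length + 1)
        segments.append((species, idx, start, contig[start:start + segment_length]))
    return segments


if __name__ == "__main__":
    toy = {"species_a": ["ACGTACGTAC"], "species_b": ["GG", "TTTTGGGGCC"]}
    for segment in sample_segments(toy, num_segments=3, segment_length=4):
        print(segment)
```

In a pipeline of the kind the abstract outlines, each sampled segment would then be searched against all input genomes to collect candidate copies for a gene tree; that step is outside this sketch.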

Language: English

Citations

0