wQFM-DISCO: DISCO-enabled wQFM improves phylogenomic analyses despite the presence of paralogs
Sheikh Azizul Hakim, Md. Rownok Zahan Ratul, Md. Shamsuzzoha Bayzid

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Issue: unknown

Published: Dec. 7, 2023

Abstract: Gene trees often differ from the species trees that contain them due to various factors, including incomplete lineage sorting (ILS), gene duplication and loss (GDL), and horizontal gene transfer (HGT). Several highly accurate species tree estimation methods have been introduced to explicitly address ILS, including ASTRAL, a widely used statistically consistent method, and wQFM, a quartet amalgamation approach that is experimentally shown to be more accurate than ASTRAL. Two recent advancements, ASTRAL-Pro and DISCO, have emerged in the field of phylogenomics to consider gene duplication and loss (GDL) events. ASTRAL-Pro introduces a refined measure of quartet similarity, accounting for both orthology and paralogy. DISCO, on the other hand, offers a general strategy to decompose multicopy gene family trees into a collection of single-copy trees, allowing the utilization of methods previously designed for species tree inference in the context of single-copy gene trees. In this study, we first introduce some variants of DISCO to examine its underlying hypotheses and present analytical results on the statistical guarantees of DISCO. In particular, we introduce DISCO-R, a variant of DISCO with improved pruning that provides robust results. We then propose wQFM-DISCO (wQFM paired with DISCO) as an adaptation of wQFM to handle multicopy gene trees resulting from GDL events. Extensive evaluation studies on simulated and real data sets demonstrate that wQFM-DISCO is significantly more accurate than competing methods.

Language: English

wQFM-DISCO: DISCO-enabled wQFM improves phylogenomic analyses despite the presence of paralogs
Sheikh Azizul Hakim, Md. Rownok Zahan Ratul, Md. Shamsuzzoha Bayzid

et al.

Bioinformatics Advances, Year: 2024, Issue: 4(1)

Published: Jan. 1, 2024

Abstract: Motivation: Gene trees often differ from the species trees that contain them due to various factors, including incomplete lineage sorting (ILS) and gene duplication and loss (GDL). Several highly accurate species tree estimation methods have been introduced to explicitly address ILS, including ASTRAL, a widely used statistically consistent method, and wQFM, a quartet amalgamation approach experimentally shown to be more accurate than ASTRAL. Two recent advancements, ASTRAL-Pro and DISCO, have emerged in phylogenomics to consider GDL. ASTRAL-Pro introduces a refined quartet similarity measure, accounting for orthology and paralogy. On the other hand, DISCO offers a general strategy to decompose multi-copy gene trees into a collection of single-copy trees, allowing the utilization of methods previously designed for species tree inference in the context of single-copy gene trees. Results: In this study, we first introduce some variants of DISCO to examine its underlying hypotheses and present analytical results on the statistical guarantees of DISCO. In particular, we introduce DISCO-R, a variant of DISCO with improved pruning that provides robust results. We then demonstrate with extensive evaluation studies on simulated and real data sets that wQFM paired with DISCO consistently matches or outperforms competing methods. Availability and implementation: DISCO-R is freely available at https://github.com/skhakim/DISCO-variants.

Language: English

Cited by: 1

wQFM-TREE: highly accurate and scalable quartet-based species tree inference from gene trees
Abdur Rafi, Ahmed Mahir Sultan Rumi, Sheikh Azizul Hakim

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: July 31, 2024

Abstract: Summary methods are becoming increasingly popular for species tree estimation from multi-locus data in the presence of gene tree discordance. ASTRAL, a leading method in this class, solves the Maximum Quartet Support Species Tree problem within a constrained solution space constructed from the input gene trees. In contrast, alternative heuristics such as wQFM and wQMC operate by taking a set of weighted quartets as input and employ a divide-and-conquer strategy to construct the species tree. Recent studies showed wQFM to be more accurate than ASTRAL and wQMC, though its scalability is hindered by the computational demands of explicitly generating and weighting Θ(n^4) quartets. Here, we introduce wQFM-TREE, a novel summary method that enhances wQFM by circumventing the need for explicit quartet generation and weighting, thereby enabling its application to large datasets. Unlike wQFM, wQFM-TREE can also handle polytomies. Extensive simulations under diverse and challenging model conditions, with hundreds or thousands of taxa and genes, consistently demonstrate that wQFM-TREE matches or improves upon the accuracy of ASTRAL. Specifically, wQFM-TREE outperformed ASTRAL in 25 of the 27 model conditions analyzed in this study involving 200-1000 taxa, with statistically significant differences in 20 of these conditions. Moreover, we applied wQFM-TREE to re-analyze the green plant dataset from the One Thousand Plant Transcriptomes Initiative. Its remarkable accuracy and scalability position wQFM-TREE as a highly competitive alternative in the field. Additionally, the algorithmic and combinatorial innovations introduced in this study will benefit various quartet-based computations, advancing the state-of-the-art in phylogenetic estimations.

Language: English

Cited by: 0

QT-WEAVER: Correcting quartet distribution improves phylogenomic analyses despite gene tree estimation error
Navid Bin Hasan, Sohaib, Md. Shamsuzzoha Bayzid

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: Nov. 12, 2024

Abstract: Summarizing individual gene trees into species phylogenies using coalescent-based methods has become a standard approach in phylogenomics. However, gene tree estimation error (GTEE), arising from a combination of reasons (ranging from analytical factors to more biological causes, such as short sequences), can potentially impact the accuracy of phylogenomic inference. We, for the first time, introduce the problem of correcting the quartet distribution induced by a set of estimated gene trees, which involves updating the weights of quartets to better reflect their relative importance within the distribution. We present QT-WEAVER, the first method of its kind, which learns the quartet conflicts in the given gene trees and generates an updated quartet distribution by adjusting the weights accordingly. QT-WEAVER is a general-purpose technique needing no explicit modeling of the subject system or of GTEE or heterogeneity. Experimental studies on a collection of simulated and empirical data sets suggest that QT-WEAVER can effectively account for GTEE, which results in a substantial improvement in accuracy. Additionally, the concept and the related algorithmic and combinatorial innovations introduced in this study will benefit various quartet-based computations. Therefore, QT-WEAVER advances the state-of-the-art in the face of GTEE. QT-WEAVER is freely available in open-source form at https://github.com/navidh86/QT-WEAVER.

Language: English

Cited by: 0
