QT-WEAVER: Correcting quartet distribution improves phylogenomic analyses despite gene tree estimation error
Navid Bin Hasan, Sohaib, Md. Shamsuzzoha Bayzid et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Nov. 12, 2024

Abstract: Summarizing individual gene trees into species phylogenies using coalescent-based methods has become a standard approach in phylogenomics. However, gene tree estimation error (GTEE), arising from a combination of reasons (ranging from analytical factors to more biological causes, such as short gene sequences), can potentially impact the accuracy of phylogenomic inference. We, for the first time, introduce the problem of correcting the quartet distribution induced by a set of estimated gene trees, which involves updating the weights of the quartets to better reflect their relative importance within the gene tree distribution. We present QT-WEAVER, the first method of its kind, which learns the conflicts in the given set of gene trees and generates an updated quartet distribution by adjusting the quartet weights accordingly. QT-WEAVER is a general-purpose technique needing no explicit modeling of the subject system or of the reasons for GTEE or gene tree heterogeneity. Experimental studies on a collection of simulated and empirical data sets suggest that QT-WEAVER can effectively account for GTEE, which results in a substantial improvement in species tree accuracy. Additionally, the concept of quartet distribution correction and the related algorithmic and combinatorial innovations introduced in this study will benefit various quartet-based computations. Therefore, QT-WEAVER advances the state-of-the-art in species tree estimation in the face of GTEE. QT-WEAVER is freely available in open-source form at https://github.com/navidh86/QT-WEAVER.
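To make the phrase "quartet distribution induced by a set of gene trees" concrete, the following minimal Python sketch counts how many gene trees support each quartet topology. It is an illustration only and not the QT-WEAVER method: the nested-tuple tree encoding and the function names `clades` and `quartet_distribution` are assumptions made for this example, and the weight used here is a plain frequency count.

```python
from collections import Counter
from itertools import combinations


def clades(tree):
    """Return (leaf set of `tree`, list of leaf sets of every subtree).

    Trees are nested tuples with string leaves (an encoding assumed
    for this sketch), e.g. (("a", "b"), ("c", "d")).
    """
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def quartet_distribution(gene_trees):
    """Count, for each quartet topology ab|cd, the gene trees inducing it."""
    weights = Counter()
    for tree in gene_trees:
        leaves, splits = clades(tree)
        for four in combinations(sorted(leaves), 4):
            q = frozenset(four)
            for split in splits:
                side = split & q
                # A clade holding exactly two of the four taxa separates
                # them from the other two, which fixes the topology.
                if len(side) == 2:
                    other = q - side
                    key = tuple(sorted([tuple(sorted(side)),
                                        tuple(sorted(other))]))
                    weights[key] += 1
                    break
            # Quartets left unresolved by a polytomy contribute nothing.
    return weights


gene_trees = [
    (("a", "b"), ("c", "d")),
    (("a", "b"), ("c", "d")),
    (("a", "c"), ("b", "d")),
]
distribution = quartet_distribution(gene_trees)
```

In this toy input, two gene trees support ab|cd and one supports ac|bd; a correction method in the spirit of the abstract would then adjust such weights, a step this sketch does not attempt.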

Language: English

Quartet Based Gene Tree Imputation Using Deep Learning Improves Phylogenomic Analyses Despite Missing Data
Sazan Mahbub, Shashata Sawmya, Arpita Saha et al.

Journal of Computational Biology, Journal Year: 2022, Volume and Issue: 29(11), P. 1156 - 1172

Published: Sept. 1, 2022

Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, for a combination of reasons (ranging from sampling biases to more biological causes, as in gene birth and loss), gene trees are often incomplete, meaning that not all species of interest have a common set of genes. Incomplete gene trees can potentially impact the accuracy of phylogenomic inference. We, for the first time, introduce the problem of imputing the quartet distribution induced by a set of incomplete gene trees, which involves adding the missing quartets back to the quartet distribution. We present Quartet based Gene tree Imputation using Deep Learning (QT-GILD), an automated and specially tailored unsupervised deep learning technique, accompanied by cues from natural language processing, which learns the quartet distribution in a given set of incomplete gene trees and generates a complete set of quartets accordingly. QT-GILD is a general-purpose technique needing no explicit modeling of the subject system or of the reasons for missing data or gene tree heterogeneity. Experimental studies on a collection of simulated and empirical datasets suggest that QT-GILD can effectively impute the quartet distribution, which results in a dramatic improvement in species tree accuracy. Remarkably, QT-GILD not only imputes the missing quartets but can also account for gene tree estimation error. Therefore, QT-GILD advances the state-of-the-art in species tree estimation from gene trees in the face of missing data.
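The notion of "missing quartets" can be illustrated with a short Python sketch that reports, for every four-taxon subset, how many gene trees actually contain all four taxa. This is only a bookkeeping illustration under assumed inputs (each gene tree is represented by its taxon set alone, and the name `quartet_coverage` is hypothetical); it does not implement the deep learning imputation that QT-GILD performs.

```python
from itertools import combinations


def quartet_coverage(all_taxa, gene_tree_taxa):
    """Map each four-taxon subset to the number of gene trees containing it.

    `gene_tree_taxa` is a list of taxon sets, one per (possibly
    incomplete) gene tree; the tree topologies are ignored here.
    """
    coverage = {}
    for four in combinations(sorted(all_taxa), 4):
        needed = set(four)
        coverage[four] = sum(1 for taxa in gene_tree_taxa
                             if needed <= set(taxa))
    return coverage


all_taxa = {"a", "b", "c", "d", "e"}
gene_tree_taxa = [{"a", "b", "c", "d"}, {"a", "b", "c", "e"}]
coverage = quartet_coverage(all_taxa, gene_tree_taxa)
# Four-taxon subsets that no gene tree can resolve at all.
missing = [four for four, count in coverage.items() if count == 0]
```

With these two incomplete gene trees, every subset containing both d and e has zero coverage, so no quartet on those taxa appears in the induced distribution; these are the kind of entries an imputation method would need to supply.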

Language: English

Citations: 9

Quartet Fiduccia–Mattheyses revisited for larger phylogenetic studies
Sharmin Akter Mim, Md Zarif-Ul-Alam, Rezwana Reaz et al.

Bioinformatics, Journal Year: 2023, Volume and Issue: 39(6)

Published: June 1, 2023

Abstract
Motivation: With the recent breakthroughs in sequencing technology, phylogeny estimation at a larger scale has become a huge opportunity. For accurate estimation of large-scale phylogeny, substantial endeavor is being devoted to introducing new algorithms or upgrading current approaches. In this work, we endeavor to improve the Quartet Fiduccia and Mattheyses (QFM) algorithm to resolve phylogenetic trees of better quality with better running time. QFM was already appreciated by researchers for its good tree quality, but fell short in larger phylogenomic studies due to its excessively slow running time.
Results: We have re-designed QFM so that it can amalgamate millions of quartets over thousands of taxa into a species tree with a great level of accuracy within a short amount of time. Named “QFM Fast and Improved (QFM-FI)”, our version is 20 000× faster than the previous version and 400× faster than the widely used variant of QFM implemented in PAUP* on larger datasets. We have also provided a theoretical analysis of the running time and memory requirements of QFM-FI. We have conducted a comparative study of QFM-FI with other state-of-the-art phylogeny reconstruction methods, such as QFM, QMC, wQMC, wQFM, and ASTRAL, on simulated as well as real biological datasets. Our results show that QFM-FI improves on the running time and tree quality of QFM and produces trees that are comparable with state-of-the-art methods.
Availability and implementation: QFM-FI is open source and available at https://github.com/sharmin-mim/qfm_java.
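Quartet amalgamation methods of this family repeatedly split the taxa into two parts and judge each split by how it treats the input quartets. The Python sketch below shows one simple way to score a candidate bipartition. It is a simplified illustration under stated assumptions, not the QFM-FI implementation: the function names, the quartet encoding, and the choice of score (satisfied weight minus violated weight) are assumptions made for this example.

```python
def classify(quartet, part_a):
    """Classify quartet ab|cd against a bipartition given by one side.

    `quartet` is ((a, b), (c, d)); `part_a` is the set of taxa on one
    side of the bipartition, with every other taxon on the other side.
    """
    (a, b), (c, d) = quartet
    in_a = [taxon in part_a for taxon in (a, b, c, d)]
    if sum(in_a) != 2:
        # Three or four taxa fall on one side: left for a later split.
        return "deferred"
    if in_a[0] == in_a[1]:
        # a and b share a side, so c and d share the other one.
        return "satisfied"
    # The split separates a from b and also c from d.
    return "violated"


def partition_score(weighted_quartets, part_a):
    """Total weight of satisfied quartets minus that of violated ones."""
    score = 0
    for quartet, weight in weighted_quartets.items():
        status = classify(quartet, part_a)
        if status == "satisfied":
            score += weight
        elif status == "violated":
            score -= weight
    return score


weighted_quartets = {
    (("a", "b"), ("c", "d")): 2,
    (("a", "c"), ("b", "d")): 1,
}
score = partition_score(weighted_quartets, {"a", "b"})
```

A Fiduccia–Mattheyses style search would move taxa between the two parts to improve such a score and then recurse on each part; that search loop is omitted here.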

Language: English

Citations: 4
