PhyloJunction: a computational framework for simulating, developing, and teaching evolutionary models
Fábio K. Mendes, Michael J. Landis

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown

Published: Dec. 16, 2023

We introduce PhyloJunction, a computational framework designed to facilitate the prototyping, testing, and characterization of evolutionary models. PhyloJunction is distributed as an open-source Python library that can be used to implement a variety of models, through its flexible graphical modeling architecture and dedicated model specification language. Model design and use are exposed to users via command-line interfaces, which integrate the steps of simulating, summarizing, and visualizing data. This paper describes the features of PhyloJunction, which include, but are not limited to, a general implementation of a popular family of phylogenetic diversification models, and, moving forward, how it may be expanded to not only include new models but also become a platform for conducting and teaching statistical learning.

Language: English

Whole-genome resequencing of Chinese indigenous sheep provides insight into the genetic basis underlying climate adaptation
Meilin Jin, Huihua Wang, Gang Liu

et al.

Genetics Selection Evolution, Journal Year: 2024, Volume and Issue: 56(1)

Published: April 2, 2024

Background: Chinese indigenous sheep are valuable resources with unique features and characteristics. They are distributed across regions with different climates in mainland China; however, few reports have analyzed the environmental adaptability of sheep based on their genome. We examined the variants and signatures of selection involved in adaptation to extreme humidity, altitude, and temperature conditions in 173 sheep genomes from 41 phenotypically and geographically representative breeds to characterize the genetic basis underlying adaptation in these populations. Results: Based on the analysis of population structure, we inferred that Chinese indigenous sheep are divided into four groups: Kazakh (KAZ), Mongolian (MON), Tibetan (TIB), and Yunnan (YUN). We also detected a set of candidate genes relevant to adaptation to extreme conditions, such as drought-prone (TBXT, TG, HOXA1), high-altitude (DYSF, EPAS1, JAZF1, PDGFD, NF1), and warm-temperature (TSHR, ABCD4, TEX11) conditions. Among all candidate genes, eight (including CNTN4, DOCK10, LOC105608545, LOC121816479, SEM3A, and SVIL) overlap between conditions. One gene shows a strong signature of positive selection in one group and harbors a single nucleotide polymorphism (SNP) missense mutation located between positions 90,600,001 and 90,650,001 on chromosome 7, which leads to a change in protein structure and influences its stability. Conclusions: Analysis of the signatures of selection uncovered genes likely related to environmental adaptation and a SNP missense mutation that affects protein structure and stability. It also provides information on the evolution of the phylogeographic structure of these populations. These results provide important resources for future breeding studies and new perspectives on how animals can adapt to climate change.

Language: English

Citations

13

Subfamily evolution analysis using nuclear and chloroplast data from the same reads
Eranga Pawani Witharana, Takaya Iwasaki, Myat Htoo San

et al.

Scientific Reports, Journal Year: 2025, Volume and Issue: 15(1)

Published: Jan. 3, 2025

The chloroplast (cp) genome is a widely used tool for exploring plant evolutionary relationships, yet its effectiveness in fully resolving these relationships remains uncertain. Integrating cp genome data with nuclear DNA information offers a more comprehensive view but often requires separate datasets. In response, we employed the same raw read sequencing data to construct cp genome-based trees and nuclear phylogenetic trees using Read2Tree, a cost-efficient method for extracting conserved nuclear gene sequences from raw read data, focusing on the Aurantioideae subfamily, which includes Citrus and its relatives. The resulting nuclear trees were consistent with existing trees derived from high-throughput sequencing, but diverged from the cp genome-based trees. To elucidate the underlying complex evolutionary processes causing these discordances, we implemented an integrative workflow that utilized multiple alignments of each gene generated by Read2Tree, in conjunction with other phylogenomic methods. Our analysis revealed that incomplete lineage sorting predominantly drives these discordances, while introgression and ancient introgression also contribute to topological discrepancies within certain clades. This study underscores the cost-effectiveness of using the same raw sequencing data for both nuclear and cp genome-based analyses in understanding plant evolutionary relationships.

Language: English

Citations

1

Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution
Jose Rafael Dimayacyac, Shanyun Wu, Daohan Jiang

et al.

Genome Biology and Evolution, Journal Year: 2023, Volume and Issue: 15(12)

Published: Nov. 24, 2023

Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and, importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data are well described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene–tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein–Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred models for 66% of gene–tissue combinations. Second, we find that for 61% of gene–tissue combinations, the best-fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.

Language: English

Citations

11
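
To make concrete what "constrained around an optimum" means in the abstract above, the following minimal sketch computes the standard expected value and variance of an Ornstein–Uhlenbeck process over time. It is an illustration only, not the study's code: the function names and the parameter values (`x0`, `theta`, `alpha`, `sigma2`) are assumptions chosen for the example.

```python
import math


def ou_mean(x0: float, theta: float, alpha: float, t: float) -> float:
    """Expected trait value at time t: decays from x0 toward the optimum theta."""
    return theta + (x0 - theta) * math.exp(-alpha * t)


def ou_variance(sigma2: float, alpha: float, t: float) -> float:
    """Trait variance at time t: saturates at sigma2 / (2 * alpha)."""
    return sigma2 / (2.0 * alpha) * (1.0 - math.exp(-2.0 * alpha * t))


# Hypothetical parameters (not taken from the paper).
x0, theta, alpha, sigma2 = 0.0, 5.0, 1.0, 2.0

for t in (0.0, 0.5, 1.0, 5.0):
    print(f"t={t}: mean={ou_mean(x0, theta, alpha, t):.3f}, "
          f"var={ou_variance(sigma2, alpha, t):.3f}")
```

Unlike Brownian motion, whose variance grows without bound, the variance here levels off, which is what allows the model to describe expression values held near an optimum.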

Unifying approaches from statistical genetics and phylogenetics for mapping phenotypes in structured populations
Joshua G. Schraiber, Michael D. Edge, Matt Pennell

et al.

PLoS Biology, Journal Year: 2024, Volume and Issue: 22(10), P. e3002847 - e3002847

Published: Oct. 9, 2024

In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these 2 fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming blurred. To help bridge this divide, we lay out a general model describing the covariance between the genetic contributions to the quantitative phenotypes of different individuals. Taking this approach shows that standard models in both statistical genetics (e.g., genome-wide association studies; GWAS) and phylogenetic comparative biology (e.g., phylogenetic regression) can be interpreted as special cases of this more general quantitative-genetic model. The fact that these models share the same core architecture means that we can build a unified understanding of the strengths and limitations of different methods for controlling for genetic structure when testing for associations. We develop intuition for why and when spurious correlations may occur analytically and conduct population-genetic and phylogenetic simulations of quantitative traits. The structural similarity of problems in statistical genetics and phylogenetics enables us to take methodological advances from one field and apply them in the other. We demonstrate this by showing how a standard GWAS technique (including both the genetic relatedness matrix (GRM) as well as its leading eigenvectors, corresponding to the principal components of the genotype matrix, in a regression model) can mitigate spurious correlations in phylogenetic analyses. As a case study, we re-examine an analysis testing for coevolution of expression levels between genes across a fungal phylogeny and show that including eigenvectors of the covariance matrix as covariates decreases the false positive rate while simultaneously increasing the true positive rate. More generally, this work provides a foundation for more integrative approaches for understanding the genetic architecture of phenotypes and how evolutionary processes shape it.

Language: English

Citations

4

The meaning and measure of concordance factors in phylogenomics
Robert Lanfear, Matthew W. Hahn

Molecular Biology and Evolution, Journal Year: 2024, Volume and Issue: 41(11)

Published: Oct. 17, 2024

As phylogenomic datasets have grown in size, researchers have developed new ways to measure biological variation and to assess statistical support for specific branches. Larger datasets have more sites and loci and therefore less sampling variance. While we can more accurately measure the mean signal in these datasets, lower sampling variance is often reflected in uniformly high measures of branch support (such as the bootstrap and posterior probability), limiting their utility. Larger datasets have also revealed substantial biological variation in the topologies found across individual loci, such that the single species tree inferred by most phylogenetic methods represents a limited summary of the data for many purposes. In contrast to measures of statistical support, the degree of underlying topological variation among loci should be approximately constant regardless of the size of the dataset. "Concordance factors" (CFs) and similar statistics have therefore become increasingly important tools in phylogenetics. In this review, we explain why CFs should be thought of as descriptors of topological variation rather than as measures of statistical support, and argue that they provide important information about the predictive power of the species tree not contained in measures of support. We review a growing suite of statistics for measuring concordance, compare them in a common framework that reveals their interrelationships, and demonstrate how to calculate them using an example from birds. We also discuss how measures of concordance might change in the future as we move beyond estimating a single "tree of life" toward estimating the myriad evolutionary histories underlying genomic variation.

Language: English

Citations

4
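
As a rough illustration of the idea that a concordance factor describes topological variation among loci, the sketch below computes the percentage of gene trees that contain a given clade. It is a toy under stated assumptions, not the review's procedure: each gene tree is represented simply as a set of clades, every gene tree is assumed to contain all taxa, and the taxon names and `concordance_factor` helper are hypothetical.

```python
from typing import FrozenSet, Iterable, Set

Clade = FrozenSet[str]


def concordance_factor(branch: Clade, gene_trees: Iterable[Set[Clade]]) -> float:
    """Percentage of gene trees whose clade set contains the given branch."""
    trees = list(gene_trees)
    if not trees:
        raise ValueError("need at least one gene tree")
    supporting = sum(1 for clades in trees if branch in clades)
    return 100.0 * supporting / len(trees)


# Four hypothetical gene trees over taxa A, B, C (plus an implicit outgroup).
gene_trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"B", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
]

gcf_ab = concordance_factor(frozenset({"A", "B"}), gene_trees)
print(f"concordance factor for clade (A,B): {gcf_ab:.1f}%")
```

Adding more loci with the same mix of topologies would leave this value unchanged, which matches the abstract's point that the degree of topological variation should not depend on dataset size.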

Selecting a Window Size for the Analysis of Whole Genome Alignments using AIC
Jeremias Ivan, Paul B. Frandsen, Robert Lanfear

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 7, 2025

The variation of evolutionary histories along the genome presents a challenge for phylogenomic methods to identify the non-recombining regions and reconstruct the phylogenetic tree for each region. To address this problem, many studies used the non-overlapping window approach, often with an arbitrary selection of fixed window sizes that potentially include intra-window recombination events. In this study, we proposed an information theoretic approach to select a window size that best reflects the underlying histories of the alignment. First, we simulated chromosome alignments that reflected the key characteristics of an empirical dataset and found that the AIC is a good predictor of window size accuracy in correctly recovering the tree topologies of the alignment. Due to the issue of missing data in empirical datasets, we then designed a stepwise non-overlapping window approach and applied this method to the genomes of erato-sara Heliconius butterflies and great apes. We found that the best window sizes for the butterflies' chromosomes ranged from < 125 bp to 250 bp, which are much shorter than those used in a previous study, even though this difference in window size did not significantly change the most common topologies across the genome. On the other hand, the best window sizes for great apes' chromosomes ranged from 500 bp to 1 kb, with the proportion of the major topology (grouping human and chimpanzee) falling between 60% and 87%, consistent with previous findings. Additionally, we observed a notable impact of stochastic error and concatenation when using small and large windows, respectively. For instance, the proportion of the major topology for great apes was 50% when using 250 bp windows, but reached almost 100% for 64 kb windows. In conclusion, our study highlights the challenges associated with selecting a window size in non-overlapping window analyses and proposes the AIC as a more objective way to select the optimal window size for whole genome alignments.

Language: English

Citations

0
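
The AIC-based selection described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that a log-likelihood and a free-parameter count are already available for each window, and that a window size is scored by summing the per-window AIC values (AIC = 2k - 2 ln L). All numbers and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Each window fit is (log_likelihood, number_of_free_parameters).
WindowFit = Tuple[float, int]


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def total_aic(windows: List[WindowFit]) -> float:
    """Score one window size by summing the AIC over all of its windows."""
    return sum(aic(lnl, k) for lnl, k in windows)


def best_window_size(fits: Dict[int, List[WindowFit]]) -> int:
    """Return the window size (in bp) with the lowest total AIC."""
    return min(fits, key=lambda size: total_aic(fits[size]))


# Hypothetical fits for a 1 kb alignment cut into 250 bp, 500 bp, or 1 kb windows.
fits = {
    250: [(-1200.0, 9), (-1180.0, 9), (-1210.0, 9), (-1190.0, 9)],
    500: [(-2395.0, 9), (-2410.0, 9)],
    1000: [(-4830.0, 9)],
}

best = best_window_size(fits)
print(f"best window size: {best} bp (total AIC {total_aic(fits[best]):.1f})")
```

Smaller windows add parameters (one tree and model per window), so they win only when the gain in likelihood outweighs that penalty, which is the trade-off the criterion is meant to capture.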

Reconciling Gene Tree Discordance and Biogeography in European Crows
Chyi Yin Gwee, Dirk Metzler, Jérôme Fuchs

et al.

Molecular Ecology, Journal Year: 2025, Volume and Issue: unknown

Published: April 10, 2025

Reconstructing the evolutionary history of young lineages diverging with gene flow is challenging due to factors like incomplete lineage sorting, introgression, and selection causing gene tree discordance. The European crow hybrid zone between all-black carrion crows and grey-coated hooded crows exemplifies this challenge. Most of the genome in Western and Central European carrion crow populations is near-identical to that of hooded crows, but differs substantially from their Iberian congeners. A notable exception is a single major-effect colour-locus under sexual selection, aligning with the 'species' tree. To understand the underlying evolutionary processes, we reconstructed the biogeographic history of the species complex. During the Pleistocene, carrion and hooded crows took refuge in the Iberian Peninsula and the Middle East, respectively. Allele-sharing among crows that are likewise black at the colour-locus represents the last trace of carrion crow ancestry, resisting gene flow from expanding hooded crow populations that have homogenised most of the genome. A model of introgression from an ancestor into populations near the Pyrenées was significantly less supported. We found no positive relationship between introgression and recombination rate, consistent with the absence of genome-wide, polygenic barriers to gene flow. Overall, this study portrays a scenario where few large-effect loci, subject to divergent selection, resist rampant and asymmetric gene exchange. This underscores the importance of integrating population demography and biogeography to accurately interpret patterns of gene tree discordance following divergence.

Language: English

Citations

0

Homoplastic versus xenoplastic evolution: exploring the emergence of key intrinsic and extrinsic traits in the montane genus Soldanella (Primulaceae)
Ivan Rurik, Andrea Melichárková, Eliška Gbúrová Štubová

et al.

The Plant Journal, Journal Year: 2024, Volume and Issue: 118(3), P. 753 - 765

Published: Jan. 13, 2024

Specific ecological conditions in the high mountain environment exert a selective pressure that often leads to convergent trait evolution. Reticulations induced by incomplete lineage sorting and introgression can lead to discordant trait patterns among gene and species trees (hemiplasy/xenoplasy), providing a false illusion that the traits under study are homoplastic. Using phylogenetic networks, we explored the effect of gene exchange on trait evolution in Soldanella, a genus profoundly influenced by historical introgression. At least three features evolved independently multiple times: the single-flowered dwarf phenotype, the dysploid cytotype, and ecological generalism. The present analyses also indicated that the recurring occurrence of stoloniferous growth might have been prompted by an introgression event between an ancestral lineage and a still extant species, although its emergence via convergent evolution cannot be completely ruled out. Phylogenetic regression suggested that the independent evolution of larger genomes in snowbells is most likely a result of the interplay between hybridization events of dysploid and euploid taxa and hostile environments at the range margins of the genus. The emergence of key intrinsic and extrinsic traits in snowbells has been significantly impacted not only by convergent evolution but also by historical and recent introgression events.

Language: English

Citations

3

A tale of too many trees: a conundrum for phylogenetic regression

Richard H. Adams, Jenniffer Roa Lozano, Mataya Duncan

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Feb. 20, 2024

Just exactly which tree(s) should we assume when testing evolutionary hypotheses? This question has plagued comparative biologists for decades. Given a perfectly estimated tree (if this is even possible in practice), we seldom know with certainty whether such a tree is truly best (or even adequate) to represent the evolutionary history of our studied traits. Regardless of our certainty, choosing a tree is required for all phylogenetic comparative methods. Yet, phylogenetic conflict and error are ubiquitous in modern comparative biology, and we are still learning about their dangers when testing evolutionary hypotheses. Here we investigated the consequences of gene tree-species tree mismatch for phylogenetic regression in the presence of incomplete lineage sorting. Our simulation experiments reveal excessively high false positive rates for mismatched phylogenetic regression with both small and large trees, simple and complex traits, and known and estimated phylogenies. In some cases, we find evidence of a directionality of error: incorrectly assuming a species tree for traits that evolved according to a gene tree sometimes fares worse than the opposite. To explore difficult yet realistic regression scenarios, we also used estimated rather than known trees to conduct case studies, as well as an expansive gene expression dataset to investigate an arguably best-case scenario in which one may have a better chance to match tree with trait. Though never meant to be a panacea for all that may ail phylogenetic comparative methods, we found promise in the application of a robust estimator as a potential, albeit imperfect, solution to some issues raised by tree mismatch, perhaps offering a path forward. Collectively, our results emphasize the importance of careful study design for comparative methods, highlighting the need to fully appreciate the role of adequate modeling.

Language: English

Citations

3

reconcILS: A gene tree-species tree reconciliation algorithm that allows for incomplete lineage sorting
Sarthak Mishra, Megan L. Smith, Matthew W. Hahn

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown

Published: Nov. 5, 2023

Reconciliation algorithms provide an accounting of the evolutionary history of individual gene trees given a species tree. Many reconciliation algorithms consider only duplication and loss events (and sometimes horizontal transfer), ignoring effects of the coalescent process, including incomplete lineage sorting (ILS). Here, we present a new heuristic algorithm for carrying out reconciliation that accurately accounts for ILS by treating it as a series of nearest neighbor interchange (NNI) events. For discordant branches of the gene tree identified by last common ancestor (LCA) mapping, our algorithm recursively chooses the optimal history by comparing the cost of duplication and loss to the cost of NNI and loss. We demonstrate the accuracy of our new method, which we call reconcILS, using a new simulation engine (dupcoal) that can accurately generate gene trees produced by the interaction of duplication, loss, and ILS. Despite being a heuristic, we show that reconcILS is much more accurate than models that ignore ILS, and at least as accurate or better than leading methods that can model ILS, duplication, and loss, while also being able to handle much larger datasets. We demonstrate the use of reconcILS by applying it to a dataset of 23 primate genomes, highlighting its accuracy compared to standard methods in the presence of large amounts of ILS.

Language: English

Citations

4