Accurate and fast graph-based pangenome annotation and clustering with ggCaller DOI Creative Commons
Samuel Horsfield, Gerry Tonkin‐Hill, Nicholas J. Croucher

et al.

Genome Research, Year: 2023, Volume 33(9), pp. 1622-1637

Published: Aug. 24, 2023

Bacterial genomes differ in both gene content and sequence mutations, which underlie extensive phenotypic diversity, including variation in susceptibility to antimicrobials or vaccine-induced immunity. To identify and quantify important variants, all genes within a population must be predicted, functionally annotated, and clustered, representing the "pangenome." Despite the volume of genome data available, gene prediction and annotation are currently conducted in isolation on individual genomes, which is computationally inefficient and frequently inconsistent across genomes. Here, we introduce the open-source software graph-gene-caller (ggCaller). ggCaller combines gene prediction, functional annotation, and clustering into a single workflow using population-wide de Bruijn graphs, removing redundancy in gene annotation and resulting in more accurate gene predictions and orthologue clustering. We applied ggCaller to simulated and real-world bacterial data sets containing hundreds or thousands of genomes, comparing it to current state-of-the-art tools. ggCaller has considerable speed-ups with equivalent or greater accuracy, particularly with data sets containing complex sources of error, such as assembly contamination or fragmentation. ggCaller is also an important extension to bacterial genome-wide association studies, enabling querying of annotated graphs for functional analyses. We highlight this application by functionally annotating DNA sequences with significant associations to tetracycline and macrolide resistance in Streptococcus pneumoniae, identifying key resistance determinants that were missed when using only a single reference genome. ggCaller is a novel bacterial genome analysis tool with applications in bacterial evolution and epidemiology.

Language: English

Producing polished prokaryotic pangenomes with the Panaroo pipeline DOI Creative Commons
Gerry Tonkin‐Hill, Neil MacAlasdair, Christopher Ruis

et al.

Genome Biology, Year: 2020, Volume 21(1)

Published: July 22, 2020

Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content resulting from horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here, we introduce Panaroo, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. Panaroo is available at https://github.com/gtonkinhill/panaroo.

Language: English

Cited by: 709

Horizontal gene transfer and adaptive evolution in bacteria DOI
Brian J. Arnold, I-Ting Huang, William P. Hanage

et al.

Nature Reviews Microbiology, Year: 2021, Volume 20(4), pp. 206-218

Published: Nov. 12, 2021

Language: English

Cited by: 498

Fast and flexible bacterial genomic epidemiology with PopPUNK DOI Creative Commons
John A. Lees, Simon R. Harris, Gerry Tonkin‐Hill

et al.

Genome Research, Year: 2019, Volume 29(2), pp. 304-316

Published: Jan. 24, 2019

The routine use of genomics for disease surveillance provides the opportunity for high-resolution bacterial epidemiology. Current whole-genome clustering and multilocus typing approaches do not fully exploit core and accessory genomic variation, and they cannot both automatically identify, and subsequently expand, clusters of significantly similar isolates in large data sets spanning entire species. Here, we describe PopPUNK (

Language: English

Cited by: 348

Population genomics of bacterial host adaptation DOI
Samuel K. Sheppard, David S. Guttman, J. Ross Fitzgerald

et al.

Nature Reviews Genetics, Year: 2018, Volume 19(9), pp. 549-565

Published: July 4, 2018

Language: English

Cited by: 223

International genomic definition of pneumococcal lineages, to contextualise disease, antibiotic resistance and vaccine impact DOI Creative Commons
Rebecca A. Gladstone, Stephanie W. Lo, John A. Lees

et al.

EBioMedicine, Year: 2019, Volume 43, pp. 338-346

Published: April 16, 2019

Pneumococcal conjugate vaccines have reduced the incidence of invasive pneumococcal disease caused by vaccine serotypes, but non-vaccine serotypes remain a concern. We used whole genome sequencing to study pneumococcal serotype, antibiotic resistance and invasiveness, in the context of genetic background. Our dataset of 13,454 genomes, combined with four published genomic datasets, represented Africa (40%), Asia (25%), Europe (19%), North America (12%), and South America (5%). These 20,027 pneumococcal genomes were clustered into lineages using PopPUNK, and named Global Pneumococcal Sequence Clusters (GPSCs). From our dataset, we additionally derived serotype and sequence type, and predicted antibiotic sensitivity. We then measured invasiveness using odds ratios relating prevalence in invasive pneumococcal disease to carriage. The combined collections (n = 20,027) were clustered into 621 GPSCs. Thirty-five GPSCs observed in our dataset were represented by >100 isolates, and subsequently classed as dominant-GPSCs. In 22/35 (63%) of dominant-GPSCs both non-vaccine serotypes and vaccine serotypes were observed in the years up until, and including, the first year of pneumococcal conjugate vaccine introduction. Penicillin and multidrug resistance were higher (p < .05) in a subset of dominant-GPSCs (14/35 and 9/35, respectively), and resistance to an increasing number of antibiotic classes was associated with increased recombination (R2 = 0.27, p < .0001). In 28/35 dominant-GPSCs, the country of isolation was a significant predictor of its antibiogram (mean misclassification error 0.28, SD ± 0.13). We detected increased invasiveness of six genetic backgrounds, when compared to other genetic backgrounds expressing the same serotype. Up to 1.6-fold changes in invasiveness odds ratio were observed. We define GPSCs that can be assigned to any pneumococcal genomic dataset, to aid international comparisons. Existing non-vaccine serotypes in most GPSCs preclude the removal of these lineages by pneumococcal conjugate vaccines, leaving potential for serotype replacement. A subset of GPSCs have increased resistance and/or serotype-independent invasiveness.

Language: English

Cited by: 205

Atlas of group A streptococcal vaccine candidates compiled using large-scale comparative genomics DOI Open Access
Mark R. Davies, Liam McIntyre, Ankur Mutreja

et al.

Nature Genetics, Year: 2019, Volume 51(6), pp. 1035-1043

Published: May 27, 2019

Language: English

Cited by: 176

Genomics and pathotypes of the many faces of Escherichia coli DOI Creative Commons
Jeroen Geurtsen, Mark de Been, Eveline Weerdenburg

et al.

FEMS Microbiology Reviews, Year: 2022, Volume 46(6)

Published: June 24, 2022

Escherichia coli is the most researched microbial organism in the world. Its varied impact on human health, consisting of commensalism, gastrointestinal disease, or extraintestinal pathologies, has generated a separation of the species into at least eleven pathotypes (also known as pathovars). These are broadly split into two groups, intestinal pathogenic E. coli (InPEC) and extraintestinal pathogenic E. coli (ExPEC). However, components of E. coli's infinite open accessory genome are horizontally transferred with substantial frequency, creating pathogenic hybrid strains that defy a clear pathotype designation. Here, we take a birds-eye view of the E. coli species, characterizing it from historical, clinical, and genetic perspectives. We examine the wide spectrum of human disease caused by E. coli, the genome content of the bacterium, and its propensity to acquire, exchange, and maintain antibiotic resistance genes and virulence traits. Our portrayal of the species also discusses elements that have shaped the overall population structure of E. coli and summarizes the current state of vaccine development targeted at the most frequent E. coli pathovars. In our conclusions, we advocate streamlining efforts for clinical reporting of ExPEC, and emphasize the pathogenic potential that exists throughout the entire species.

Language: English

Cited by: 100

Diversification of Colonization Factors in a Multidrug-Resistant Escherichia coli Lineage Evolving under Negative Frequency-Dependent Selection DOI Creative Commons
Alan McNally, Teemu Kallonen, Christopher Connor

et al.

mBio, Year: 2019, Volume 10(2)

Published: April 22, 2019

Infections with multidrug-resistant (MDR) strains of Escherichia coli are a significant global public health concern. To combat these pathogens, we need a deeper understanding of how they evolved from their background populations. By understanding the processes that underpin their emergence, we can design new strategies to limit the evolution of new clones and combat existing clones. By combining population genomics with modelling approaches, we show that dominant MDR clones of E. coli are under the influence of negative frequency-dependent selection, preventing them from rising to fixation in a population. Furthermore, we show that this selection acts on genes involved in anaerobic metabolism, suggesting that this key trait, and the ability to colonize human intestinal tracts, is a key step in the evolution of MDR clones of E. coli.

Language: English

Cited by: 121

Improved Prediction of Bacterial Genotype-Phenotype Associations Using Interpretable Pangenome-Spanning Regressions DOI Creative Commons
John A. Lees, The Tien Mai, Marco Galardini

et al.

mBio, Year: 2020, Volume 11(4)

Published: July 6, 2020

Being able to identify the genetic variants responsible for specific bacterial phenotypes has been the goal of bacterial genetics since its inception and is fundamental to our current level of understanding of bacteria. This identification has been based primarily on painstaking experimentation, but the availability of large data sets of whole genomes with associated phenotype metadata promises to revolutionize this approach, not least for important clinical phenotypes that are not amenable to laboratory analysis. These models of phenotype-genotype association can in the future be used for rapid prediction of clinically important phenotypes such as antibiotic resistance and virulence by rapid-turnaround or point-of-care tests. However, despite much effort being put into adapting genome-wide association study (GWAS) approaches to cope with bacterium-specific problems, such as strong population structure and horizontal gene exchange, current approaches are not yet optimal. We describe a method that advances methodology for both association and generation of portable prediction models.

Language: English

Cited by: 101

Global emergence and population dynamics of divergent serotype 3 CC180 pneumococci DOI Creative Commons
Taj Azarian, Patrick K. Mitchell, Maria Georgieva

et al.

PLoS Pathogens, Year: 2018, Volume 14(11), e1007438

Published: Nov. 26, 2018

Streptococcus pneumoniae serotype 3 remains a significant cause of morbidity and mortality worldwide, despite inclusion in the 13-valent pneumococcal conjugate vaccine (PCV13). Serotype 3 increased in carriage since the implementation of PCV13 in the USA, while invasive disease rates remain unchanged. We investigated the persistence of serotype 3 in carriage and disease, through genomic analyses of a global sample of 301 serotype 3 isolates of the Netherlands3-31 (PMEN31) clone CC180, combined with associated patient data and PCV utilization among countries of isolate collection. We assessed phenotypic variation between dominant clades in capsule charge (zeta potential), capsular polysaccharide shedding, and susceptibility to opsonophagocytic killing, which have previously been associated with carriage duration, invasiveness, and vaccine escape. We identified a recent shift in the CC180 population attributed to a lineage termed Clade II, which was estimated by Bayesian coalescent analysis to have first appeared in 1968 [95% HPD: 1939-1989] and increased in prevalence and effective population size thereafter. Clade II isolates are divergent from the pre-PCV13 serotype 3 population in non-capsular antigenic composition, competence, and antibiotic susceptibility, the last of which results from the acquisition of a Tn916-like conjugative transposon. Differences in recombination rates among clades correlated with variations in the ATP-binding subunit of Clp protease, as well as amino acid substitutions in the comCDE operon. Opsonophagocytic killing assays elucidated the low observed efficacy of PCV13 against serotype 3. Variation in PCV13 use among sampled countries was not independently correlated with the CC180 population shift; therefore, genotypic and phenotypic differences in protein antigens and, in particular, antibiotic resistance may have contributed to the increase of Clade II. Our analysis emphasizes the need for routine, representative sampling of isolates from disperse geographic regions, including historically under-sampled areas. We also highlight the value of genomics in resolving antigenic and epidemiological variations within a serotype, which may have implications for future vaccine development.

Language: English

Cited by: 92