PhageScope: a well-annotated bacteriophage database with automatic analyses and visualizations

Ruo Han Wang, Shuo Yang, Zhixuan Liu

et al.

Nucleic Acids Research, Year: 2023, No. 52(D1), pp. D756 - D761

Published: Oct. 30, 2023

Abstract: Bacteriophages are viruses that infect bacteria or archaea. Understanding the diverse and intricate genomic architectures of phages is essential to study microbial ecosystems and develop phage therapy strategies. However, existing phage databases are short of meticulous annotations. To this end, we propose PhageScope (https://phagescope.deepomics.org), an online phage database with comprehensive annotations. PhageScope harbors a collection of 873 718 phage sequences from various sources. Applying fifteen state-of-the-art tools to perform systematic annotations and analyses, PhageScope provides annotations on genome completeness, host range, lifestyle information, taxonomy classification, nine types of structural and functional genetic elements, and three types of comparative genomic studies for curated phages. Additionally, PhageScope incorporates automatic analyses and visualizations for curated and customized phages, serving as an efficient platform for phage study.

Language: English

ColabFold: making protein folding accessible to all
Milot Mirdita, Konstantin Schütze, Yoshitaka Moriwaki

et al.

Nature Methods, Year: 2022, No. 19(6), pp. 679 - 682

Published: May 30, 2022

Abstract: ColabFold offers accelerated prediction of protein structures and complexes by combining the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold. ColabFold's 40−60-fold faster search and optimized model utilization enables prediction of close to 1,000 structures per day on a server with one graphics processing unit. Coupled with Google Colaboratory, ColabFold becomes a free and accessible platform for protein folding. ColabFold is open-source software available at https://github.com/sokrypton/ColabFold and its novel environmental databases are available at https://colabfold.mmseqs.com.

Language: English

Cited by

6620

ColabFold - Making protein folding accessible to all
Milot Mirdita, Konstantin Schütze, Yoshitaka Moriwaki

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2021, No. unknown

Published: Aug. 15, 2021

ColabFold offers accelerated protein structure and complex predictions by combining the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold. ColabFold's 40-60× faster search and optimized model use allows predicting close to a thousand structures per day on a server with one GPU. Coupled with Google Colaboratory, ColabFold becomes a free and accessible platform for protein folding. ColabFold is open-source software available at github.com/sokrypton/ColabFold. Its novel environmental databases are available at colabfold.mmseqs.com.

Language: English

Cited by

555

Horizontal gene transfer and adaptive evolution in bacteria
Brian J. Arnold, I-Ting Huang, William P. Hanage

et al.

Nature Reviews Microbiology, Year: 2021, No. 20(4), pp. 206 - 218

Published: Nov. 12, 2021

Language: English

Cited by

493

Metagenomic compendium of 189,680 DNA viruses from the human gut microbiome
Stephen Nayfach, David Páez-Espino, Lee Call

et al.

Nature Microbiology, Year: 2021, No. 6(7), pp. 960 - 970

Published: June 24, 2021

Bacteriophages have important roles in the ecology of the human gut microbiome but are under-represented in reference databases. To address this problem, we assembled the Metagenomic Gut Virus catalogue that comprises 189,680 viral genomes from 11,810 publicly available human stool metagenomes. Over 75% of genomes represent double-stranded DNA phages that infect members of the Bacteroidia and Clostridia classes. Based on sequence clustering we identified 54,118 candidate viral species, 92% of which were not found in existing databases. The Metagenomic Gut Virus catalogue improves detection of viruses in stool metagenomes and accounts for nearly 40% of CRISPR spacers found in human gut Bacteria and Archaea. We also produced a catalogue of 459,375 viral protein clusters to explore the functional potential of the gut virome. This revealed tens of thousands of diversity-generating retroelements, which use error-prone reverse transcription to mutate target genes and may be involved in the molecular arms race between phages and their bacterial hosts.

Language: English

Cited by

406

Petabase-scale sequence alignment catalyses viral discovery
R. C. Edgar, Brie Taylor, Victor S.-Y. Lin

et al.

Nature, Year: 2022, No. 602(7895), pp. 142 - 147

Published: Jan. 26, 2022

Public databases contain a planetary collection of nucleic acid sequences, but their systematic exploration has been inhibited by a lack of efficient methods for searching this corpus, which (at the time of writing) exceeds 20 petabases and is growing exponentially. Here we developed a cloud computing infrastructure, Serratus, to enable ultra-high-throughput sequence alignment at the petabase scale. We searched 5.7 million biologically diverse samples (10.2 petabases) for the hallmark gene RNA-dependent RNA polymerase and identified well over 10^5 novel RNA viruses, thereby expanding the number of known species by roughly an order of magnitude. We characterized novel viruses related to coronaviruses, hepatitis delta virus and huge phages, respectively, and analysed their environmental reservoirs. To catalyse the ongoing revolution of viral discovery, we established a free and comprehensive database of these data and tools. Expanding the known sequence diversity of viruses can reveal the evolutionary origins of emerging pathogens and improve pathogen surveillance for the anticipation and mitigation of future pandemics.

Language: English

Cited by

381

IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata
Antônio Pedro Camargo, Stephen Nayfach, I-Min A. Chen

et al.

Nucleic Acids Research, Year: 2022, No. 51(D1), pp. D733 - D743

Published: Nov. 18, 2022

Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad ways by which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. A web interface enables users to efficiently browse and search viruses based on genome features and/or sequence similarity. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared to the previous version. These are clustered into 8.7 million viral operational taxonomic units, including 231 408 with at least one high-quality representative. Viral sequences in IMG/VR are now systematically identified from genomes, metagenomes, and metatranscriptomes using a new detection approach (geNomad), and IMG standard annotation is complemented with genome quality estimation using CheckV, taxonomic classification reflecting the latest taxonomic standards, and microbial host taxonomy prediction. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.

Language: English

Cited by

231

mobileOG-db: a Manually Curated Database of Protein Families Mediating the Life Cycle of Bacterial Mobile Genetic Elements
Connor Brown, James Mullet, Fadi Hindi

et al.

Applied and Environmental Microbiology, Year: 2022, No. 88(18)

Published: Aug. 29, 2022

Bacterial mobile genetic elements (MGEs) encode functional modules that perform both core and accessory functions for the element, the latter of which are often only transiently associated with the element. The presence of these accessory genes, which are often close homologs to primarily immobile genes, incurs high rates of false positives and, therefore, limits the usability of databases for MGE annotation. To overcome this limitation, we analyzed 10,776,849 protein sequences derived from eight MGE databases to compile a comprehensive set of 6,140 manually curated protein families that are linked to the "life cycle" (integration/excision, replication/recombination/repair, transfer, stability/transfer/defense, and phage-specific processes) of plasmids, phages, integrative, transposable, and conjugative elements. We overlay experimental information where available to create a tiered annotation scheme of high-quality annotations and annotations inferred exclusively through bioinformatic evidence. We additionally provide an MGE-class label for each entry (e.g., plasmid or integrative element), and assign to each entry a major and minor category. The resulting database, mobileOG-db (for mobile orthologous groups), comprises over 700,000 deduplicated sequences encompassing five major mobileOG categories and more than 50 minor categories, providing a structured language and interpretable basis for an array of MGE-centered analyses. mobileOG-db can be accessed at mobileogdb.flsi.cloud.vt.edu/, where users can select, refine, and analyze custom subsets of the dynamic mobilome.

IMPORTANCE: The analysis of bacterial mobile genetic elements in genomic data is a critical step toward profiling the root causes of antibiotic resistance, phenotypic or metabolic diversity, and the evolution of bacterial genera. Existing methods for MGE annotation pose high barriers of biological and computational expertise to properly harness. To bridge this gap, we systematically analyzed proteins from MGEs to identify protein families that can serve as candidate hallmarks, i.e., proteins that can be used as "signatures" of MGEs to aid annotation. The resulting resource, mobileOG-db, provides a multilevel classification scheme that encompasses plasmid, phage, integrative, and transposable element protein families categorized into major and minor categories. mobileOG-db thus provides a rich resource for simple and intuitive element annotation that can be integrated seamlessly into existing MGE detection pipelines and colocalization analyses.

Language: English

Cited by

207

BACPHLIP: predicting bacteriophage lifestyle from conserved protein domains
Adam J. Hockenberry, Claus O. Wilke

PeerJ, Year: 2021, No. 9, p. e11396

Published: May 6, 2021

Bacteriophages are broadly classified into two distinct lifestyles: temperate and virulent. Temperate phages are capable of a latent phase of infection within a host cell (lysogenic cycle), whereas virulent phages directly replicate and lyse host cells upon infection (lytic cycle). Accurate lifestyle identification is critical for determining the role of individual phage species within ecosystems and their effect on host evolution. Here, we present BACPHLIP, a BACterioPHage LIfestyle Predictor. BACPHLIP detects the presence of a set of conserved protein domains within an input genome and uses this data to predict lifestyle via a Random Forest classifier that was trained on a dataset of 634 phage genomes. On an independent test set of 423 phages, BACPHLIP has an accuracy of 98%, greatly exceeding that of previously existing tools (79%). BACPHLIP is freely available on GitHub (https://github.com/adamhockenberry/bacphlip) and the code used to build and test the model is provided in a separate repository (https://github.com/adamhockenberry/bacphlip-model-dev) for users wishing to interrogate and re-train the underlying classification model.

Language: English

Cited by

192

A catalog of tens of thousands of viruses from human metagenomes reveals hidden associations with chronic diseases
Michael J. Tisza, Christopher B. Buck

Proceedings of the National Academy of Sciences, Year: 2021, No. 118(23)

Published: June 3, 2021

Significance: Mechanisms of many human chronic diseases involve abnormal action of the immune system and/or altered metabolism. The microbiome, an important regulator of metabolic and immune-related phenotypes, has been shown to be associated with or participate in the development of a variety of chronic diseases. Viruses of bacteria (i.e., "phages") are ubiquitous and mysterious, and several studies have shown that phages exert great control over the behavior, and misbehavior, of their host bacteria. This study uses techniques to discover and analyze 45,000 viruses associated with human bodies. The abundance of 2,000 specific phages is found to correlate with common chronic diseases.

Language: English

Cited by

192

Alterations in the Gut Virome in Obesity and Type 2 Diabetes Mellitus
Keli Yang, Junkun Niu, Tao Zuo

et al.

Gastroenterology, Year: 2021, No. 161(4), pp. 1257 - 1269.e13

Published: June 25, 2021

Language: English

Cited by

146