Ranking environmental and edaphic attributes driving soil microbial community structure and activity with special attention to spatial and temporal scales
V. V. S. R. Gupta, James M. Tiedje

mLife, Journal Year: 2024, Volume and Issue: 3(1), P. 21 - 41

Published: March 1, 2024

Abstract: The incredibly complex soil microbial communities at small scales make their analysis and identification of reasons for the observed structures challenging. Microbial community structure is mainly a result of the inoculum (dispersal), the selective advantages of those organisms under the habitat‐based environmental attributes, and the ability of those colonizers to sustain themselves over time. Since soil is protective and its inhabitants have long adapted to varied conditions, significant portions of the community structure are likely stable. Hence, a substantial portion of the community will not correlate to often measured soil attributes. We suggest that the drivers be ranked on the basis of their importance to the fundamental needs of the microbes: (i) supply of energy, i.e., organic carbon and electron acceptors; (ii) effectors or stressors, i.e., pH, salt, drought, and toxic chemicals; (iii) macro‐organism associations, i.e., plants and their seasonality, animals and their fecal matter, and soil fauna; and (iv) nutrients, in order, N, P, and, probably of lesser importance, the other micronutrients and metals. The relevance of the drivers also varies with spatial and time scales, for example, aggregate to field to regional, and persistent to dynamic populations to transcripts, and with the extent of phylogenetic difference, hence phenotypic differences of organismal groups. We present a summary matrix to provide guidance on which drivers are important for particular studies, with special emphasis on a wide range of spatial and temporal scales, and illustrate this with genomic and population (rRNA gene) data from selected studies.

Language: English

Evolutionary-scale prediction of atomic-level protein structure with a language model
Zeming Lin, Halil Akin, Roshan Rao, et al.

Science, Journal Year: 2023, Volume and Issue: 379(6637), P. 1123 - 1130

Published: March 16, 2023

Recent advances in machine learning have leveraged evolutionary information in multiple sequence alignments to predict protein structure. We demonstrate direct inference of full atomic-level protein structure from primary sequence using a large language model. As language models of protein sequences are scaled up to 15 billion parameters, an atomic-resolution picture of protein structure emerges in the learned representations. This results in an order-of-magnitude acceleration of high-resolution structure prediction, which enables large-scale structural characterization of metagenomic proteins. We apply this capability to construct the ESM Metagenomic Atlas by predicting structures for >617 million metagenomic protein sequences, including >225 million that are predicted with high confidence, which gives a view into the vast breadth and diversity of natural proteins.

Language: English

Citations: 2210

Assessment of global health risk of antibiotic resistance genes
Zhenyan Zhang, Qi Zhang, Tingzhang Wang, et al.

Nature Communications, Journal Year: 2022, Volume and Issue: 13(1)

Published: March 23, 2022

Antibiotic resistance genes (ARGs) have accelerated microbial threats to human health in the last decade. Many genes can confer resistance, but evaluating the relative health risks of ARGs is complex. Factors such as the abundance, propensity for lateral transmission and ability of ARGs to be expressed in pathogens are all important. Here, an analysis at the metagenomic level from various habitats (6 types of habitats, 4572 metagenomic samples) detects 2561 ARGs that collectively conferred resistance to 24 classes of antibiotics. We quantitatively evaluate the health risk to humans, defined as the risk that ARGs will confound the clinical treatment of pathogens, of these 2561 ARGs by integrating human accessibility, mobility, pathogenicity and clinical availability. Our results demonstrate that 23.78% of the ARGs pose a health risk, especially those which confer multidrug resistance. We also calculate the antibiotic resistance risks of all samples in four main habitats and, with machine learning, successfully map the antibiotic resistance threats in global marine habitats with over 75% accuracy. Our novel method for quantitatively surveilling the health risk of ARGs will help to manage one of the most important threats to human and animal health.

Language: English

Citations: 527

Petabase-scale sequence alignment catalyses viral discovery
R. C. Edgar, Brie Taylor, Victor S.-Y. Lin, et al.

Nature, Journal Year: 2022, Volume and Issue: 602(7895), P. 142 - 147

Published: Jan. 26, 2022

Public databases contain a planetary collection of nucleic acid sequences, but their systematic exploration has been inhibited by a lack of efficient methods for searching this corpus, which (at the time of writing) exceeds 20 petabases and is growing exponentially. Here we developed a cloud computing infrastructure, Serratus, to enable ultra-high-throughput sequence alignment at the petabase scale. We searched 5.7 million biologically diverse samples (10.2 petabases) for the hallmark gene RNA-dependent RNA polymerase and identified well over 10^5 novel RNA viruses, thereby expanding the number of known species by roughly an order of magnitude. We characterized novel viruses related to coronaviruses, hepatitis delta virus and huge phages, respectively, and analysed their environmental reservoirs. To catalyse the ongoing revolution of viral discovery, we established a free and comprehensive database of these data and tools. Expanding the known sequence diversity of viruses can reveal the evolutionary origins of emerging pathogens and improve pathogen surveillance for the anticipation and mitigation of future pandemics.

Language: English

Citations: 389

Genomes on a Tree (GoaT): A versatile, scalable search engine for genomic and sequencing project metadata across the eukaryotic tree of life
Richard Challis, Sujai Kumar, Cibele G. Sotero-Caio, et al.

Wellcome Open Research, Journal Year: 2023, Volume and Issue: 8, P. 24 - 24

Published: Jan. 17, 2023

As genomic data transform our understanding of biodiversity, the Earth BioGenome Project (EBP) has set a goal of generating reference quality genome assemblies for all ~1.9 million described eukaryotic taxa. Meeting this goal requires coordination among many individual regional and taxon-focussed projects working under the EBP umbrella. Large-scale sequencing projects require ready access to validated genome-relevant metadata, such as genome sizes and karyotypes, but these data are dispersed across the literature, and directly measured values are lacking for most taxa. To meet these needs, we have developed Genomes on a Tree (GoaT), an Elasticsearch-powered datastore and search index for genome-relevant metadata and sequencing project plans and statuses. GoaT indexes publicly available genome-relevant metadata for all eukaryotic species and interpolates missing values through phylogenetic comparison. GoaT also holds target priority and sequencing status information for many projects affiliated to the EBP to aid project coordination. Metadata and status attributes in GoaT can be queried through a mature API, a web front end, and a command line interface. The web front end additionally provides summary visualisations for data exploration and reporting (see https://goat.genomehubs.org). GoaT currently holds direct or estimated values for over 70 taxon attributes and over 30 assembly attributes across 1.5 million eukaryotic species. The depth and breadth of curated data, frequent updates, and a versatile query interface make GoaT a powerful data aggregator and portal to explore and report underlying data for the eukaryotic tree of life. We illustrate the utility of GoaT through a series of use cases from planning to completion of a genome-sequencing project.

Language: English

Citations: 337

Evolutionary-scale prediction of atomic level protein structure with a language model
Zeming Lin, Halil Akin, Roshan Rao, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2022, Volume and Issue: unknown

Published: July 21, 2022

Abstract: Artificial intelligence has the potential to open insight into the structure of proteins at the scale of evolution. It has only recently been possible to extend protein structure prediction to two hundred million cataloged proteins. Characterizing the structures of the exponentially growing billions of protein sequences revealed by large scale gene sequencing experiments would necessitate a break-through in the speed of folding. Here we show that direct inference of structure from primary sequence using a language model enables an order of magnitude speed-up in high resolution structure prediction. Leveraging the insight that language models learn evolutionary patterns across millions of sequences, we train models up to 15B parameters, the largest language models of proteins to date. As the language models are scaled they learn information that enables prediction of the three-dimensional structure of a protein at the resolution of individual atoms. This results in prediction that is up to 60x faster than state-of-the-art while maintaining resolution and accuracy. Building on this, we present the ESM Metagenomic Atlas. This is the first large-scale structural characterization of metagenomic proteins, with more than 617 million structures. The atlas reveals more than 225 million high confidence predictions, including millions whose structures are novel in comparison with experimentally determined structures, giving an unprecedented view into the vast breadth and diversity of the structures of some of the least understood proteins on earth.

Language: English

Citations: 260

IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata
Antônio Pedro Camargo, Stephen Nayfach, I-Min A. Chen, et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D733 - D743

Published: Nov. 18, 2022

Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad ways by which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. A web interface enables users to efficiently browse and search viruses based on genome features and/or sequence similarity. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared to the previous version. These clustered into 8.7 million viral operational taxonomic units, including 231 408 with at least one high-quality representative. Viral sequences in IMG/VR are now systematically identified from genomes, metagenomes, and metatranscriptomes using a new detection approach (geNomad), and the IMG standard annotations are complemented with genome quality estimation using CheckV, taxonomic classification reflecting the latest taxonomic standards, and microbial host taxonomy prediction. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.

Language: English

Citations: 241

The IMG/M data management and analysis system v.7: content updates and new features
I-Min A. Chen, Ken Chu, Krishna Palaniappan, et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D723 - D732

Published: Nov. 16, 2022

The Integrated Microbial Genomes & Microbiomes system (IMG/M: https://img.jgi.doe.gov/m/) at the Department of Energy (DOE) Joint Genome Institute (JGI) continues to provide support for users to perform comparative analysis of isolate and single cell genomes, metagenomes, and metatranscriptomes. In addition to datasets produced by the JGI, IMG v.7 also includes datasets imported from public sources such as NCBI Genbank, SRA, and the DOE National Microbiome Data Collaborative (NMDC), or submitted by external users. In the past couple of years, we have continued our effort to help the user community by improving the annotation pipeline, upgrading the contents with new reference database versions, and adding new analysis functionalities such as advanced scaffold search, Average Nucleotide Identity (ANI) for high-quality metagenome bins, new cassette search, improved gene neighborhood display, and improvements to metatranscriptome data display and analysis. We also extended the collaboration and integration efforts with other DOE-funded projects such as NMDC and DOE Biology Knowledgebase (KBase).

Language: English

Citations: 222

Expansion of the global RNA virome reveals diverse clades of bacteriophages
Uri Neri, Yuri I. Wolf, Simon Roux, et al.

Cell, Journal Year: 2022, Volume and Issue: 185(21), P. 4023 - 4037.e18

Published: Sept. 28, 2022

High-throughput RNA sequencing offers broad opportunities to explore the Earth RNA virome. Mining 5,150 diverse metatranscriptomes uncovered >2.5 million RNA virus contigs. Analysis of >330,000 RNA-dependent RNA polymerases (RdRPs) shows that this expansion corresponds to a 5-fold increase of the known RNA virus diversity. Gene content analysis revealed multiple protein domains previously not found in RNA viruses and implicated in virus-host interactions. Extended RdRP phylogeny supports the monophyly of the five established phyla and reveals two putative additional bacteriophage phyla and numerous putative additional classes and orders. The dramatically expanded phylum Lenarviricota, consisting of bacterial and related eukaryotic viruses, now accounts for a third of the known RNA virus diversity. Identification of CRISPR spacer matches and bacteriolytic proteins suggests that subsets of picobirnaviruses and partitiviruses, previously associated with eukaryotes, infect prokaryotic hosts.

Language: English

Citations: 205

Gut microbiota composition is associated with SARS-CoV-2 vaccine immunogenicity and adverse events
Siew C. Ng, Ye Peng, Lin Zhang, et al.

Gut, Journal Year: 2022, Volume and Issue: 71(6), P. 1106 - 1116

Published: Feb. 9, 2022

Objective: The gut microbiota plays a key role in modulating host immune response. We conducted a prospective, observational study to examine gut microbiota composition in association with immune responses and adverse events in adults who have received the inactivated vaccine (CoronaVac; Sinovac) or the mRNA vaccine (BNT162b2; BioNTech; Comirnaty). Design: We performed shotgun metagenomic sequencing in stool samples of 138 COVID-19 vaccinees (37 CoronaVac and 101 BNT162b2 vaccinees) collected at baseline and 1 month after the second dose of vaccination. Immune markers were measured by SARS-CoV-2 surrogate virus neutralisation test and spike receptor-binding domain IgG ELISA. Results: We found a significantly lower immune response in recipients of CoronaVac than of BNT162b2 vaccines (p<0.05). Bifidobacterium adolescentis was persistently higher in subjects with high neutralising antibodies to CoronaVac vaccine (p=0.023) and their baseline gut microbiome was enriched in pathways related to carbohydrate metabolism (linear discriminant analysis (LDA) scores >2 and p<0.05). Neutralising antibodies in BNT162b2 vaccinees showed a positive correlation with the total abundance of bacteria with flagella and fimbriae, including Roseburia faecis (p=0.028). Prevotella copri and two Megamonas species were enriched in individuals with fewer adverse events following either of the vaccines, indicating that these bacteria may play an anti-inflammatory role in host immune response (LDA scores >3 and p<0.05). Conclusion: Our study has identified specific gut microbiota markers in association with improved immune response and reduced adverse events following COVID-19 vaccines. Microbiota-targeted interventions have the potential to complement the effectiveness of COVID-19 vaccines.

Language: English

Citations: 142

Unraveling the functional dark matter through global metagenomics
Georgios A. Pavlopoulos, Fotis A. Baltoumas, Sirui Liu, et al.

Nature, Journal Year: 2023, Volume and Issue: 622(7983), P. 594 - 602

Published: Oct. 11, 2023

Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities.

Language: English

Citations: 87