Combination of Whole Genome Sequencing and Metagenomics for Microbiological Diagnostics DOI Open Access
Srinithi Purushothaman, Marco Meola, Adrian Egli

et al.

International Journal of Molecular Sciences, Journal Year: 2022, Volume and Issue: 23(17), P. 9834 - 9834

Published: Aug. 30, 2022

Whole genome sequencing (WGS) provides the highest resolution for genome-based species identification and can provide insight into the antimicrobial resistance and virulence potential of a single microbiological isolate during the diagnostic process. In contrast, metagenomic sequencing allows the analysis of DNA segments from multiple microorganisms within a community, using either an amplicon- or a shotgun-based approach. However, WGS and shotgun metagenomic data are rarely combined, although such an approach may generate additive or synergistic information that is critical for, e.g., patient management, infection control, and pathogen surveillance. To produce a combined workflow with actionable outputs, we need to understand the pre-to-post analytical process of both technologies. This will require specific databases storing interlinked metadata, and also involves customized bioinformatic pipelines. This review article provides an overview of the steps and clinical application of combining WGS and metagenomics together for microbiological diagnosis.

Language: English

IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata DOI Creative Commons
Antônio Pedro Camargo, Stephen Nayfach, I-Min A. Chen

et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D733 - D743

Published: Nov. 18, 2022

Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad ways by which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. A web interface enables users to efficiently browse and search viruses based on genome features and/or sequence similarity. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared with the previous version. These were clustered into 8.7 million viral operational taxonomic units, including 231 408 with at least one high-quality representative. Viral sequences in IMG/VR v4 are now systematically identified from genomes, metagenomes, and metatranscriptomes using a new detection approach (geNomad), and IMG standard annotation is complemented with genome quality estimation using CheckV, taxonomic classification reflecting the latest taxonomic standards, and microbial host taxonomy prediction. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.

Language: English

Citations

235

Expansion of the global RNA virome reveals diverse clades of bacteriophages DOI Creative Commons
Uri Neri, Yuri I. Wolf, Simon Roux

et al.

Cell, Journal Year: 2022, Volume and Issue: 185(21), P. 4023 - 4037.e18

Published: Sept. 28, 2022

High-throughput RNA sequencing offers broad opportunities to explore the Earth RNA virome. Mining 5,150 diverse metatranscriptomes uncovered >2.5 million RNA virus contigs. Analysis of >330,000 RNA-dependent RNA polymerases (RdRPs) shows that this expansion corresponds to a 5-fold increase of the known RNA virus diversity. Gene content analysis revealed multiple protein domains previously not found in RNA viruses and implicated in virus-host interactions. Extended RdRP phylogeny supports the monophyly of the five established phyla and reveals two putative additional bacteriophage phyla and numerous putative additional classes and orders. The dramatically expanded phylum Lenarviricota, consisting of bacterial and related eukaryotic viruses, now accounts for a third of the RNA virome. Identification of CRISPR spacer matches and bacteriolytic proteins suggests that subsets of picobirnaviruses and partitiviruses, previously associated with eukaryotes, infect prokaryotic hosts.

Language: English

Citations

203

Identification of mobile genetic elements with geNomad DOI Creative Commons
Antônio Pedro Camargo, Simon Roux, Frederik Schulz

et al.

Nature Biotechnology, Journal Year: 2023, Volume and Issue: 42(8), P. 1303 - 1312

Published: Sept. 21, 2023

Identifying and characterizing mobile genetic elements in sequencing data is essential for understanding their diversity, ecology, biotechnological applications and impact on public health. Here we introduce geNomad, a classification and annotation framework that combines information from gene content and a deep neural network to identify sequences of plasmids and viruses. geNomad uses a dataset of more than 200,000 marker protein profiles to provide functional gene annotation and taxonomic assignment of viral genomes. Using a conditional random field model, geNomad also detects proviruses integrated into host genomes with high precision. In benchmarks, geNomad achieved high classification performance for diverse plasmids and viruses (Matthews correlation coefficient of 77.8% and 95.3%, respectively), substantially outperforming other tools. Leveraging geNomad's speed and scalability, we processed over 2.7 trillion base pairs of sequencing data, leading to the discovery of millions of viruses and plasmids that are available through the IMG/VR and IMG/PR databases. geNomad is available at https://portal.nersc.gov/genomad.

Language: English

Citations

197

Database resources of the National Center for Biotechnology Information DOI Creative Commons
Eric W Sayers, Jeffrey Beck, Evan Bolton

et al.

Nucleic Acids Research, Journal Year: 2023, Volume and Issue: 52(D1), P. D33 - D43

Published: Nov. 22, 2023

The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, SciENcv, the NIH Comparative Genomics Resource (CGR), NCBI Virus, SRA, RefSeq, foreign contamination screening tools, Taxonomy, iCn3D, ClinVar, GTR, MedGen, dbSNP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.

Language: English

Citations

133

Metagenomic Data Assembly – The Way of Decoding Unknown Microorganisms DOI Creative Commons
Alla Lapidus, Anton Korobeynikov

Frontiers in Microbiology, Journal Year: 2021, Volume and Issue: 12

Published: March 23, 2021

Metagenomics is a segment of conventional microbial genomics dedicated to the sequencing and analysis of the combined genomic DNA of entire environmental samples. The most critical step of metagenomic data analysis is the reconstruction of individual genes and genomes of the microorganisms in the communities using metagenomic assemblers, computational programs that put together small fragments of sequenced DNA generated by sequencing instruments. Here, we describe the challenges of metagenomic assembly and a wide spectrum of applications in which metagenomic assemblies were used to better understand the ecology and evolution of microbial ecosystems, and present one of the most efficient assemblers, SPAdes, which was upgraded to become applicable for metagenomics.

Language: English

Citations

104

Twenty-five years of Genomes OnLine Database (GOLD): data updates and new features in v.9 DOI Creative Commons
Supratim Mukherjee, Dimitri Stamatis, Cindy Tianqing Li

et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D957 - D963

Published: Oct. 16, 2022

The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute (DOE-JGI) continues to maintain its role as one of the flagship genomic metadata repositories of the world. The ever-increasing number of projects and metadata are freely available to the user community world-wide. GOLD's metadata is consumed by scientists and remains an important source for large-scale comparative genomics analysis initiatives. Encouraged by this active user engagement and growth, GOLD has continued to add new components and capabilities. New features, such as a public Application Programming Interface (API) and an Ecosystem landing page, as well as the growth of different entities in the current GOLD v.9 edition, are described in detail in this manuscript.

Language: English

Citations

73

RdRp-scan: A bioinformatic resource to identify and annotate divergent RNA viruses in metagenomic sequence data DOI Creative Commons
Justine Charon, Jan P. Buchmann, Sabrina Sadiq

et al.

Virus Evolution, Journal Year: 2022, Volume and Issue: 8(2)

Published: July 1, 2022

Despite a rapid expansion in the number of documented viruses following the advent of metagenomic sequencing, the identification and annotation of highly divergent RNA viruses remain challenging, particularly from poorly characterized hosts and environmental samples. Protein structures are more conserved than primary sequence data, such that structure-based comparisons provide an opportunity to reveal the viral 'dusk matter': viral sequences with low, but detectable, levels of sequence identity to known viruses with available protein structures. Here, we present a new open computational resource, RdRp-scan, that contains a standardized bioinformatic toolkit to identify and annotate divergent RNA viruses in metagenomic sequence data based on the detection of RNA-dependent RNA polymerase (RdRp) sequences. By combining RdRp-specific hidden Markov models (HMMs) and structural comparisons, we show that RdRp-scan can efficiently detect RdRp sequences with identity levels as low as 10 per cent to those from known viruses and not identifiable using standard sequence-to-sequence comparisons. In addition, to facilitate the annotation and placement of newly detected and divergent virus-like sequences into the diversity of RNA viruses, RdRp-scan provides new custom and curated databases of viral RdRp sequences and core motifs, as well as pre-built RdRp multiple sequence alignments. In parallel, our analysis of the sequence diversity detected by RdRp-scan revealed that while most of the taxonomically unassigned RdRps fell into pre-established clusters, some fell into potentially new orders of RNA viruses related to the Wolframvirales and Tolivirales. Finally, a survey of the conserved A, B, and C RdRp motifs within the RdRp-scan sequence database revealed additional variations of both sequence and position that might provide new insights into the structure, function, and evolution of viral polymerases.

Language: English

Citations

70

Mining metatranscriptomes reveals a vast world of viroid-like circular RNAs DOI Creative Commons
Benjamin D. Lee, Uri Neri, Simon Roux

et al.

Cell, Journal Year: 2023, Volume and Issue: 186(3), P. 646 - 661.e4

Published: Jan. 24, 2023

Viroids and viroid-like covalently closed circular (ccc) RNAs are minimal replicators that typically encode no proteins and hijack cellular enzymes for replication. The extent and diversity of viroid-like agents are poorly understood. We developed a computational pipeline to identify viroid-like cccRNAs and applied it to 5,131 metatranscriptomes and 1,344 plant transcriptomes. The search yielded 11,378 viroid-like cccRNAs spanning 4,409 species-level clusters, a 5-fold increase compared to the previously identified viroid-like elements. Within this diverse collection, we discovered numerous putative viroids, satellite RNAs, retrozymes, and ribozy-like viruses. Diverse ribozyme combinations and unusual ribozymes within the cccRNAs were identified. Self-cleaving ribozymes were identified in ambiviruses, some mito-like viruses, and capsid-encoding satellite virus-like cccRNAs. The broad presence of viroid-like cccRNAs in diverse transcriptomes and ecosystems implies that their host range is far broader than currently known, and matches to CRISPR spacers suggest that some cccRNAs replicate in prokaryotes.

Language: English

Citations

63

ElasticBLAST: accelerating sequence search via cloud computing DOI Creative Commons
Christiam Camacho, Grzegorz M. Boratyn, Victor Joukov

et al.

BMC Bioinformatics, Journal Year: 2023, Volume and Issue: 24(1)

Published: March 26, 2023

Biomedical researchers use alignments produced by BLAST (Basic Local Alignment Search Tool) to categorize their query sequences. Producing such alignments is an essential bioinformatics task that is well suited for the cloud. The cloud can perform many calculations quickly as well as store and access large volumes of data. Bioinformaticians can also use it to collaborate with other researchers, sharing their results, datasets and even their pipelines on a common platform. We present ElasticBLAST, a cloud native application to perform BLAST alignments in the cloud. ElasticBLAST can handle anywhere from a few to many thousands of queries and run the searches on thousands of virtual CPUs (if desired), deleting resources when it is done. It uses cloud native tools for orchestration and can request discounted instances, lowering cloud costs for users. It is supported on Amazon Web Services and Google Cloud Platform. It can search BLAST databases that are user provided or from the National Center for Biotechnology Information. We show that ElasticBLAST is a useful application that can efficiently perform BLAST searches for the user in the cloud, demonstrating that with two examples. At the same time, it hides much of the complexity of working in the cloud, lowering the threshold to move work to the cloud.

Language: English

Citations

62

Host traits shape virome composition and virus transmission in wild small mammals DOI Creative Commons
Yanmei Chen, Shu-Jian Hu, Xian‐Dan Lin

et al.

Cell, Journal Year: 2023, Volume and Issue: 186(21), P. 4662 - 4675.e12

Published: Sept. 20, 2023

Language: English

Citations

59