Diverse and abundant phages exploit conjugative plasmids
Natalia Quinones‐Olvera, Siân V. Owen, Lucy M. McCully

et al.

Nature Communications, Journal Year: 2024, Volume and Issue: 15(1)

Published: April 12, 2024

Phages exert profound evolutionary pressure on bacteria by interacting with receptors on the cell surface to initiate infection. While the majority of phages use chromosomally encoded structures as receptors, plasmid-dependent phages exploit plasmid-encoded conjugation proteins, making their host range dependent on horizontal transfer of the plasmid. Despite their unique biology and biotechnological significance, only a small number of plasmid-dependent phages have been characterized. Here we systematically search for new plasmid-dependent phages targeting IncP and IncF plasmids using a targeted discovery platform, and find that they are common and abundant in wastewater, and largely unexplored in terms of their genetic diversity. Plasmid-dependent phages are enriched in non-canonical types of phages, and all but one of the 65 phages we isolated were non-tailed, and members of the lipid-containing tectiviruses, ssDNA filamentous phages or ssRNA phages. We show that plasmid-dependent tectiviruses exhibit differences in their host range, which is associated with variation in the phage holin protein. Despite their relatively high abundance in wastewater, plasmid-dependent tectiviruses are missed by metaviromic analyses, underscoring the continued importance of culture-based phage discovery. Finally, we identify a tailed phage dependent on the IncF plasmid, and find related structural genes in phages that use the orthogonal type 4 pilus as a receptor, highlighting the evolutionarily promiscuous use of these distinct contractile structures by multiple groups of phages. Taken together, these results indicate that plasmid-dependent phages play an under-appreciated evolutionary role in constraining horizontal gene transfer via conjugative plasmids.

Language: English

UniProt: the Universal Protein Knowledgebase in 2023
Alex Bateman, María Martín, Sandra Orchard

et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D523 - D531

Published: Nov. 21, 2022

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this publication we describe enhancements made to our data processing pipeline and to our website to adapt to an ever-increasing information content. The number of sequences in UniProtKB has risen to over 227 million and we are working towards including a reference proteome for each taxonomic group. We continue to extract detailed annotations from the literature to update or create reviewed entries, while unreviewed entries are supplemented with annotations provided by automated systems using a variety of machine-learning techniques. In addition, the scientific community continues their contributions of publications and annotations to UniProt entries of their interest. Finally, we describe our new website (https://www.uniprot.org/), designed to enhance our users' experience and make our data easily accessible to the research community. This interface includes access to AlphaFold structures for more than 85% of all entries as well as improved visualisations for subcellular localisation of proteins.

Language: English

Citations

4345

KEGG for taxonomy-based analysis of pathways and genomes
Minoru Kanehisa, Miho Furumichi, Yoko Sato

et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D587 - D592

Published: Oct. 27, 2022

KEGG (https://www.kegg.jp) is a manually curated database resource integrating various biological objects categorized into systems, genomic, chemical and health information. Each object (database entry) is identified by the KEGG identifier (kid), which generally takes the form of a prefix followed by a five-digit number, and can be retrieved by appending /entry/kid in the URL. The KEGG pathway map viewer, the Brite hierarchy viewer and the newly released KEGG genome browser can be launched by appending /pathway/kid, /brite/kid and /genome/kid, respectively, in the URL. Together with an improved annotation procedure for KO (KEGG Orthology) assignment, an increasing number of eukaryotic genomes have been included in KEGG for better representation of organisms in the taxonomic tree. Multiple taxonomy files are generated for classification of KEGG organisms and viruses, and the Brite hierarchy viewer is used for taxonomy mapping, a variant of Brite mapping in the new KEGG Mapper suite. The taxonomy mapping enables analysis of, for example, how functional links of genes in the pathway and physical links of genes on the chromosome are conserved among organism groups.

Language: English

Citations

3297

RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning
S.K. Burley, Charmi Bhikadiya, Chunxiao Bi

et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D488 - D508

Published: Nov. 24, 2022

The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ∼200 000 experimentally-determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a 'living data resource.' Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.

Language: English

Citations

495

Genomes on a Tree (GoaT): A versatile, scalable search engine for genomic and sequencing project metadata across the eukaryotic tree of life
Richard Challis, Sujai Kumar, Cibele G. Sotero-Caio

et al.

Wellcome Open Research, Journal Year: 2023, Volume and Issue: 8, P. 24 - 24

Published: Jan. 17, 2023

As genomic data transform our understanding of biodiversity, the Earth BioGenome Project (EBP) has set a goal of generating reference quality genome assemblies for all ~1.9 million described eukaryotic taxa. Meeting this goal requires coordination among many individual regional and taxon-focussed projects working under the EBP umbrella. Large-scale sequencing projects require ready access to validated genome-relevant metadata, such as genome sizes and karyotypes, but these data are dispersed across the literature, and directly measured values are lacking for most taxa. To meet these needs, we have developed Genomes on a Tree (GoaT), an Elasticsearch-powered datastore and search index for genome-relevant metadata and sequencing project plans and statuses. GoaT indexes publicly available metadata for all eukaryotic species and interpolates missing values through phylogenetic comparison. GoaT also holds target priority and sequencing status information for many projects affiliated with the EBP to aid project coordination. Metadata and attributes in GoaT can be queried through a mature API, a web front end, and a command line interface. The web front end additionally provides summary visualisations for data exploration and reporting (see https://goat.genomehubs.org). GoaT currently holds direct or estimated values for over 70 taxon attributes and over 30 assembly attributes across 1.5 million eukaryotic species. The depth and breadth of curated data, the frequent updates, and a versatile query interface make GoaT a powerful data aggregator and portal to explore and report underlying data for the eukaryotic tree of life. We illustrate the utility of GoaT through a series of use cases from planning to completion of a genome-sequencing project.

Language: English

Citations

336

The Sequence Read Archive: a decade more of explosive growth
Kenneth Katz, Oleg Shutov, Richard T. Lapoint

et al.

Nucleic Acids Research, Journal Year: 2021, Volume and Issue: 50(D1), P. D387 - D390

Published: Oct. 19, 2021

The Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra/) stores raw sequencing data and alignment information to enhance reproducibility and facilitate new discoveries through data analysis. Here we note changes in storage designed to increase access, and highlight analyses that augment metadata with taxonomic insight to help users select data. In addition, we present three unanticipated applications of

Language: English

Citations

243

TTD: Therapeutic Target Database describing target druggability information
Ying Zhou, Yintao Zhang, Donghai Zhao

et al.

Nucleic Acids Research, Journal Year: 2023, Volume and Issue: 52(D1), P. D1465 - D1477

Published: Sept. 15, 2023

Target discovery is one of the essential steps in modern drug development, and the identification of promising targets is fundamental for developing a first-in-class drug. A variety of methods have emerged for target assessment based on druggability analysis, which refers to the likelihood of a target being effectively modulated by drug-like agents. In the therapeutic target database (TTD), nine categories of established druggability characteristics were thus collected for 426 successful, 1014 clinical trial, 212 preclinical/patented, and 1479 literature-reported targets via systematic review. These characteristic categories were classified into three distinct perspectives: molecular interaction/regulation, human system profile and cell-based expression variation. With the rapid progression of technology and concerted effort in drug discovery, TTD and other databases are expected to facilitate the exploration of druggability characteristics for the discovery and validation of innovative drug targets. TTD is now freely accessible at: https://idrblab.org/ttd/.

Language: English

Citations

234

AnimalTFDB 4.0: a comprehensive animal transcription factor database updated with variation and expression annotations
Wen‐Kang Shen, Si‐Yi Chen, Zi-Quan Gan

et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D39 - D45

Published: Oct. 5, 2022

Transcription factors (TFs) are proteins that interact with specific DNA sequences to regulate gene expression and play crucial roles in all kinds of biological processes. To keep up with new data and provide a more comprehensive resource for TF research, we updated the Animal Transcription Factor Database (AnimalTFDB) to version 4.0 (http://bioinfo.life.hust.edu.cn/AnimalTFDB4/) with up-to-date data and functions. We refined the TF family rules and prediction pipeline to predict TFs in genome-wide protein sequences from Ensembl. As a result, we predicted 274 633 TF genes and 150 726 transcription cofactor genes in AnimalTFDB 4.0 in 183 animal genomes, which is 86 more species than in AnimalTFDB 3.0. Besides doubling the data volume, we also added the following new annotations and functions to the database: (i) variations (including mutations) on TFs in various human cancers and other diseases; (ii) predicted post-translational modification sites (including phosphorylation, acetylation, methylation and ubiquitination sites) on TFs in 8 species; (iii) TF regulation in autophagy; (iv) comprehensive TF expression annotation for 38 species; (v) exact and batch search functions that allow users to search AnimalTFDB flexibly. AnimalTFDB 4.0 is a useful resource for studying TFs and transcription regulation, and contains comprehensive annotation and classification of TFs and transcription cofactors.

Language: English

Citations

160

Update on the proposed minimal standards for the use of genome data for the taxonomy of prokaryotes
Raúl Riesco, Martha E. Trujillo

International Journal of Systematic and Evolutionary Microbiology, Journal Year: 2024, Volume and Issue: 74(3)

Published: March 21, 2024

The field of microbial taxonomy is dynamic, aiming to provide a stable and contemporary classification system for prokaryotes. Traditionally, reliance on phenotypic characteristics limited the comprehensive understanding of microbial diversity and evolution. The introduction of molecular techniques, particularly DNA sequencing and genomics, has transformed our perception of prokaryotic diversity. In the past two decades, advancements in genome sequencing have driven a transition from traditional methods to a genome-based taxonomic framework, not only to define species, but also higher taxonomic ranks. As technology and databases rapidly expand, maintaining updated standards is crucial. This work seeks to revise the 2018 guidelines for applying genome sequencing data in microbial taxonomy, adapting minimal standards and recommendations to reflect technological progress during this period.

Language: English

Citations

160

Molecular and cellular evolution of the primate dorsolateral prefrontal cortex
Shaojie Ma, Mario Škarica, Qian Li

et al.

Science, Journal Year: 2022, Volume and Issue: 377(6614)

Published: Aug. 25, 2022

The granular dorsolateral prefrontal cortex (dlPFC) is an evolutionary specialization of primates that is centrally involved in cognition. We assessed more than 600,000 single-nucleus transcriptomes from adult human, chimpanzee, macaque, and marmoset dlPFC. Although most cell subtypes defined transcriptomically are conserved, we detected several that exist only in a subset of species as well as substantial species-specific molecular differences across homologous neuronal, glial, and non-neural subtypes. The latter are exemplified by human-specific switching between expression of the neuropeptide somatostatin and tyrosine hydroxylase, the rate-limiting enzyme in dopamine production in certain interneurons. The above molecular differences are also illustrated by expression of the neuropsychiatric risk gene

Language: English

Citations

155

CAMPR4: a database of natural and synthetic antimicrobial peptides
Ulka Gawde, Shuvechha Chakraborty, Faiza Hanif Waghu

et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D377 - D383

Published: Oct. 11, 2022

There has been an exponential increase in the design of synthetic antimicrobial peptides (AMPs) for use as novel antibiotics. Synthetic AMPs are substantially enriched in residues with physicochemical properties known to be critical for antimicrobial activity, such as positive charge, hydrophobicity, and higher alpha helical propensity. The current prediction algorithms for AMPs have been developed using AMP sequences from natural sources and hence do not perform well for synthetic peptides. In this version of the CAMP database, along with updating sequence information of AMPs, we have created separate prediction algorithms for natural and synthetic AMPs. CAMPR4 holds 24243 AMP sequences, 933 structures, 2143 patents and 263 AMP family signatures. In addition to the data on sequences, source organisms, target organisms, and minimum inhibitory and hemolytic concentrations, CAMPR4 provides information on N and C terminal modifications and the presence of unusual amino acids, as applicable. The database is integrated with tools for AMP prediction and rational design (natural and synthetic AMPs), sequence (BLAST and clustal omega), structure (VAST) and family analysis (PRATT, ScanProsite, CAMPSign). CAMPR4 will aid and enhance AMP research. CAMPR4 is accessible at http://camp.bicnirrh.res.in/.

Language: English

Citations

141