Large protein databases reveal structural complementarity and functional locality DOI Creative Commons
Paweł Szczerbiak, Lukasz M. Szydlowski, Witold Wydmański et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Aug. 17, 2024

Recent breakthroughs in protein structure prediction have led to an unprecedented surge in high-quality 3D models, highlighting the need for efficient computational solutions to manage and analyze this wealth of structural data. In our work, we comprehensively examine the clusters obtained from the AlphaFold Protein Structure Database (AFDB), a subset of ESMAtlas, and the Microbiome Immunity Project (MIP). We create a single cohesive low-dimensional representation of the resulting space. Our results show that, while each database occupies distinct regions within this space, they collectively exhibit significant overlap in their functional profiles. High-level biological functions tend to cluster in particular regions, revealing a shared landscape despite the diverse sources of data. By creating a single, cohesive space integrating these data sources, localizing functional annotations, and providing an open-access web server for exploration, this work offers insights for future research concerning sequence-structure-function relationships, enabling various questions to be asked about taxonomic assignments, environmental factors, or specificity. This approach is generalizable to other datasets, enabling further discovery beyond the findings presented here.

Language: English

TIGR-Tas: A family of modular RNA-guided DNA-targeting systems in prokaryotes and their viruses DOI
Guilhem Faure, Makoto Saito, Max E. Wilkinson et al.

Science, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 27, 2025

RNA-guided systems provide remarkable versatility, enabling diverse biological functions. Through iterative structural and sequence homology-based mining starting with a guide RNA-interaction domain of Cas9, we identified a family of RNA-guided DNA-targeting proteins in phage and parasitic bacteria. Each system consists of a Tandem Interspaced Guide RNA (TIGR) array and a TIGR-associated (Tas) protein containing a Nop domain, sometimes fused to HNH (TasH) or RuvC (TasR) nuclease domains. We show that TIGR arrays are processed into 36-nt RNAs (tigRNAs) that direct sequence-specific DNA binding through a tandem-spacer targeting mechanism. TasR can be reprogrammed for precise DNA cleavage, including in human cells. The structure reveals striking similarities to box C/D snoRNPs and IS110 transposases, providing insights into the evolution of RNA-guided systems.

Language: English

Citations: 5

Exploring the diversity of anti-defense systems across prokaryotes, phages and mobile genetic elements DOI Creative Commons
Florian Tesson, Erin Huiting, Linlin Wei et al.

Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 53(1)

Published: Dec. 9, 2024

The co-evolution of prokaryotes, phages and mobile genetic elements (MGEs) has driven the diversification of defense and anti-defense systems alike. Anti-defense proteins have diverse functional domains and sequences and are typically small, creating a challenge to detect homologs across prokaryotic and phage genomes. To date, no tools comprehensively annotate anti-defense proteins within a desired sequence. Here, we developed ‘AntiDefenseFinder’, a free open-source tool and web service that detects 156 anti-defense systems of one or more proteins in any genomic sequence. Using this dataset, we identified 47 981 anti-defense systems distributed across prokaryotes and their viruses. We found that some genes co-localize in ‘anti-defense islands’, including in Escherichia coli T4 and Lambda phages, although many appear standalone. Eighty-nine per cent of anti-defense systems localize only or preferentially in MGEs. However, >80% of anti-Pycsar protein 1 (Apyc1) resides in nonmobile regions of bacterial genomes. Evolutionary analysis and biochemical experiments revealed that Apyc1 likely originated in bacteria to regulate cyclic nucleotide (cNMP) signaling, but was co-opted to overcome cNMP-utilizing defenses. With the AntiDefenseFinder tool, we hope to facilitate the identification of the full repertoire of anti-defense systems in MGEs, the discovery of new functions and a deeper understanding of the host–pathogen arms race.

Language: English

Citations: 7

The 2025 Nucleic Acids Research database issue and the online molecular biology database collection DOI Creative Commons
Daniel J. Rigden, Xosé M. Fernández

Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 53(D1), P. D1 - D9

Published: Dec. 10, 2024

The 2025 Nucleic Acids Research database issue contains 185 papers spanning biology and related areas. Seventy-three new databases are covered, while resources previously described in the issue account for 101 update articles. Databases most recently published elsewhere account for a further 11 papers. Nucleic acid databases include EXPRESSO for multi-omics of 3D genome structure (this issue's chosen Breakthrough Resource and Article) and NAIRDB for Fourier transform infrared data. New protein databases include structure predictions for human isoforms at ASpdb and for viral proteins at BFVD. UniProt, Pfam and InterPro have all provided updates: metabolism and signalling are covered by descriptions of STRING, KEGG and CAZy, while updated microbe-oriented databases include Enterobase, VFDB and PHI-base. Biomedical research is supported, among others, by ClinVar, PubChem and DrugMAP. Genomics-related resources include Ensembl, UCSC Genome Browser and dbSNP. New plant databases cover the Solanaceae (SolR) and Asteraceae (AMIR) families, while an update from NCBI Taxonomy also features. The Database Issue is freely available on the Nucleic Acids Research website (https://academic.oup.com/nar). At the NAR online Molecular Biology Database Collection (http://www.oxfordjournals.org/nar/database/c/), 932 entries have been reviewed in the last year, 74 new resources added and 226 discontinued URLs eliminated, bringing the current total to 2236 databases.

Language: English

Citations: 3

Structural basis for cooperative ssDNA binding by bacteriophage protein filament P12 DOI Creative Commons

Lothar Träger, Morris Degen, Joana Pereira et al.

Nucleic Acids Research, Journal Year: 2025, Volume and Issue: 53(5)

Published: Feb. 11, 2025

Protein-primed DNA replication is a unique mechanism, bioorthogonal to other known replication modes. It relies on specialised single-stranded DNA (ssDNA)-binding proteins (SSBs) to stabilise ssDNA intermediates by unknown mechanisms. Here, we present the structural and biochemical characterisation of P12, an SSB from bacteriophage PRD1. High-resolution cryo-electron microscopy reveals that P12 forms a unique, cooperative filament along ssDNA. Each protomer binds the phosphate backbone of 6 nucleotides in a sequence-independent manner, protecting the ssDNA from nuclease degradation. Filament formation is driven by an intrinsically disordered C-terminal tail, facilitating cooperative binding. We identify residues essential for ssDNA interaction and link the ssDNA-binding ability of P12 to toxicity in host cells. Bioinformatic analyses place the P12 fold as a distinct branch within the OB-like fold family. This work offers new insights into protein-primed DNA replication and lays a foundation for biotechnological applications.

Language: English

Citations: 0

Discovery of diverse and high-quality mRNA capping enzymes through a language model–enabled platform DOI
Theodore Wang, Bowen R. Qin, S J Li et al.

Science Advances, Journal Year: 2025, Volume and Issue: 11(15)

Published: April 9, 2025

Mining and expanding high-quality genetic parts for synthetic biology and bioengineering are urgent needs in the research and development of next-generation biotechnology. However, gene mining has relied on sequence homology or ample expert knowledge, which fundamentally limits the establishment of a comprehensive genetic part catalog. In this work, we propose SYMPLEX (synthetic biological platform by large language model–enabled knowledge extraction), a universal gene-mining platform based on large language models. We applied SYMPLEX to mine enzymes responsible for messenger RNA (mRNA) capping, a key process in eukaryotic posttranscriptional modification, and obtained thousands of diverse candidates with traceable evidence from biomedical literature and databases. Of 46 experimentally tested integral capping enzyme candidates, 14 demonstrated in vivo cross-species capping activity, and 2 displayed superior in vitro activity over the commercial vaccinia capping enzymes currently used in mRNA vaccine production. SYMPLEX provides a distinct paradigm for functional gene mining and offers powerful tools to facilitate knowledge discovery in fundamental research.

Language: English

Citations: 0

Emerging frontiers in protein structure prediction following the AlphaFold revolution DOI Creative Commons
Martin L. Rennie, Michael R. Oliver

Journal of The Royal Society Interface, Journal Year: 2025, Volume and Issue: 22(225)

Published: April 1, 2025

Models of protein structures enable molecular understanding of biological processes. Current structure prediction tools lie at the interface of biology, chemistry and computer science. Millions of protein structure models have been generated in a very short space of time through a revolution driven by deep learning, led by AlphaFold. This has provided a wealth of new structural information. Interpreting these predictions is critical to determining where and when this information is useful. But proteins are not static, nor do they act alone, interacting with other biomolecules to complete their function at the molecular level. This review focuses on the application of state-of-the-art structure prediction to these advanced applications. We also suggest a set of guidelines for reporting AlphaFold predictions.

Language: English

Citations: 0

DSSP 4: FAIR annotation of protein secondary structure DOI Creative Commons
Maarten L. Hekkelman, Daniel Álvarez Salmoral, Anastassis Perrakis et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: April 17, 2025

Protein secondary structure annotation is essential for understanding protein architecture, serving as a cornerstone for structural classification, alignment, visualisation, and machine learning applications. The Define Secondary Structure of Proteins (DSSP) algorithm has long been the standard for assigning secondary structure elements such as α-helices, β-sheets, and loops in protein models. Here, we introduce DSSP version 4, which recapitulates the functionality of DSSP in a modern computational framework, extending it also to the detection of left-handed κ-helices (Poly-Proline II helices). To align with FAIR principles (Findable, Accessible, Interoperable, Reusable), DSSP 4 adopts mmCIF as its primary input and output format, while retaining compatibility with legacy PDB and DSSP formats. We applied this updated tool to analyse the secondary structure distribution across the Protein Data Bank (PDB), differentiating structures from diverse experimental methods and revealing insights into the prevalence and length of secondary structure elements, including the newly annotated κ-helices. The DSSP software, databank, and web server are freely accessible at https://pdb-redo.eu/dssp, ensuring broad utility and interoperability for structural biology research.

Language: English

Citations: 0

Structural biology of single-stranded, positive-sense RNA viruses in the age of accurate atomic-scale predictions of protein structures DOI Creative Commons
Stéphane Bressanelli, Sonia Fieulaine, Thibault Tubiana et al.

Virology, Journal Year: 2025, Volume and Issue: 608, P. 110546 - 110546

Published: April 24, 2025

Language: English

Citations: 0

Conserved Local Structural Motifs in Glycoside Hydrolase Families Facilitate the Discovery of Functional Enzymes DOI

Yupeng Liang, Yalan Zhao, Zhongwei Yin et al.

Journal of Agricultural and Food Chemistry, Journal Year: 2025, Volume and Issue: unknown

Published: May 5, 2025

Glycoside hydrolases (GHs) are vital for natural glycoside biotransformation, especially in enhancing the pharmacological effects of natural products like ginsenosides. In this study, we collected 67 microbial-derived ginsenoside-hydrolyzing enzymes from nine GH families. Despite differences in global structures, key residues surrounding the substrate binding site in GH1 and GH3 exhibit conserved structural motifs. Leveraging these motifs, five GH genes from Cellulosimicrobium were cloned, and three (Cbgl496, Cbgl516, Cbgl766) were characterized. Experimental results demonstrated that Cbgl766, Cbgl841 specifically catalyzed hydrolysis of the β(1-6) glycosidic bond of the C-20 sugar chain of ginsenoside Rb1 to yield Rd. Cbgl496 selectively hydrolyzed the β(1-2) glycosidic bonds of the oligosaccharide chains at the C-3 position of ginsenosides Rb1, Rb2, Rb3, and Rc, thereby directionally producing the minor ginsenosides Gy XVII, Compound O, Mx1, and Mc1. Structural analysis of 109,994 GH1/GH3 models from the AlphaFold database revealed these motifs across various organisms, emphasizing the evolutionary conservation of the 3D structure of the catalytic core region despite sequence diversity. This study underscores the importance of local structural motifs in GHs, offering insights into functional enzyme screening and understanding enzyme diversity for industrial applications.

Language: English

Citations: 0

Characterization of a multi-segmented rod-shaped mycovirus within the order Martellivirales largely accommodating plant viruses DOI Creative Commons

M. YOSHIOKA, Akihito Fukudome, Yuto Chiba et al.

Virus Research, Journal Year: 2025, Volume and Issue: unknown, P. 199591 - 199591

Published: May 1, 2025

The order Martellivirales in the Riboviria realm includes seven established families. The viruses in this order have a single-stranded positive-sense RNA genome and infect animals, plants, or fungi. In this study, we characterized Aspergillus flavus vivivirus 1 (AfViV1), an RNA virus infecting Aspergillus flavus that presumably belongs to the proposed "Viviviridae" family in this order. In previous reports, multiple related RNA-dependent RNA polymerase (RdRP) sequences were mainly identified from metatranscriptome data. However, their virological characteristics were not disclosed. Our analysis showed that the AfViV1 virion exhibited a rod-shaped structure with varying lengths and that the coat protein (CP) is encoded by RNA12 of AfViV1. Using the AfViV1-CP sequence, we detected several potential CP sequences based on sequence read archive (SRA) data. These data suggest similar structures. Interestingly, amino acid was significantly known viral CP. predicted Potyviridae (Patatavirales) Closteroviridae (Martellivirales) families (and orders). This study describes the first multi-segmented fungal ssRNA virus with rod-shaped particles and expands the morphological diversity of fungal viruses. Additionally, this study highlights similarities between fungal and plant viruses, suggesting deep evolutionary relationships concerning host range, adaptation, and more.

Language: English

Citations: 0