Deep learning methods for protein structure prediction DOI Creative Commons
Yiming Qin, Zihan Chen, Peng Ye

et al.

MedComm – Future Medicine, Journal Year: 2024, Volume and Issue: 3(3)

Published: Sept. 1, 2024

Protein structure prediction (PSP) has been a prominent topic in bioinformatics and computational biology, aiming to predict protein function from sequence data. The three‐dimensional conformation of proteins is pivotal for their intricate biological roles. With the advancement of computational capabilities and the adoption of deep learning (DL) technologies (especially Transformer network architectures), the PSP field has been ushered into a brand‐new era of “neuralization.” Here, we focus on reviewing the evolution from traditional to modern deep learning‐based approaches and the characteristics of various structural prediction methods. This emphasizes the advantages of hybrid prediction methods over traditional approaches. This study also provides a summary analysis of widely used databases and the latest structure prediction models. It discusses deep learning networks and algorithmic optimization for model training, validation, and evaluation. In addition, a discussion of the major advances in deep learning‐based protein structure prediction is presented. The update of AlphaFold 3 further extends the boundaries of prediction models, especially in protein‐small molecule structure prediction. This marks a key shift toward a holistic approach in biomolecular structure elucidation, aimed at solving almost all sequence‐to‐structure puzzles in biological phenomena.

Language: English

Macromolecular Crowding, Phase Separation, and Homeostasis in the Orchestration of Bacterial Cellular Functions DOI Creative Commons
Begoña Monterroso, William Margolin, Arnold J. Boersma

et al.

Chemical Reviews, Journal Year: 2024, Volume and Issue: 124(4), P. 1899 - 1949

Published: Feb. 8, 2024

Macromolecular crowding affects the activity of proteins and functional macromolecular complexes in all cells, including bacteria. Crowding, together with physicochemical parameters such as pH, ionic strength, and the energy status, influences the structure of the cytoplasm and thereby indirectly macromolecular function. Notably, crowding also promotes the formation of biomolecular condensates by phase separation, initially identified in eukaryotic cells but more recently discovered to play key functions in bacteria. Bacterial cells require a variety of mechanisms to maintain physicochemical homeostasis, in particular in environments with fluctuating conditions, and the formation of biomolecular condensates is emerging as one such mechanism. In this work, we connect physicochemical homeostasis and macromolecular crowding with the formation and function of biomolecular condensates in the bacterial cell and compare the supramolecular structures found in bacteria with those of eukaryotic cells. We focus on the effects of crowding and phase separation on the control of bacterial chromosome replication, segregation, and cell division, and we discuss the contribution of biomolecular condensates to bacterial cell fitness and adaptation to environmental stress.

Language: English

Citations

34

Predictomes: A classifier-curated database of AlphaFold-modeled protein-protein interactions DOI Creative Commons
E. Schmid, Johannes C. Walter

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: April 12, 2024

Protein-protein interactions (PPIs) are ubiquitous in biology, yet a comprehensive structural characterization of the PPIs underlying biochemical processes is lacking. Although AlphaFold-Multimer (AF-M) has the potential to fill this knowledge gap, standard AF-M confidence metrics do not reliably separate relevant PPIs from an abundance of false positive predictions. To address this limitation, we used machine learning on well curated datasets to train a Structure Prediction and Omics informed Classifier called SPOC that shows excellent performance in separating true and false PPIs, including in proteome-wide screens. We applied SPOC to an all-by-all matrix of nearly 300 human genome maintenance proteins, generating ~40,000 predictions that can be viewed at predictomes.org, where users can also score their own predictions with SPOC. High confidence PPIs discovered using our approach suggest novel hypotheses in genome maintenance. Our results provide a framework for interpreting large scale AF-M screens and help lay the foundation for a proteome-wide structural interactome.

Language: English

Citations

25

Democratizing protein language models with parameter-efficient fine-tuning DOI Creative Commons
Samuel Sledzieski, Meghana Kshirsagar, Minkyung Baek

et al.

Proceedings of the National Academy of Sciences, Journal Year: 2024, Volume and Issue: 121(26)

Published: June 20, 2024

Proteomics has been revolutionized by large protein language models (PLMs), which learn unsupervised representations from large corpora of sequences. These models are typically fine-tuned in a supervised setting to adapt the model to specific downstream tasks. However, the computational and memory footprint of fine-tuning (FT) large PLMs presents a barrier for many research groups with limited computational resources. Natural language processing has seen a similar explosion in the size of models, where these challenges have been addressed by methods for parameter-efficient fine-tuning (PEFT). In this work, we introduce this paradigm to proteomics through leveraging the parameter-efficient method LoRA and training new models for two important tasks: predicting protein–protein interactions (PPIs) and predicting the symmetry of homooligomer quaternary structures. We show that these approaches are competitive with traditional FT while requiring reduced memory and substantially fewer parameters. We additionally show that for the PPI prediction task, training only the classification head also remains competitive with full FT, using five orders of magnitude fewer parameters, and that each of these methods outperforms state-of-the-art PPI prediction methods with substantially reduced compute. We further perform a comprehensive evaluation of the hyperparameter space, demonstrate that PEFT of PLMs is robust to variations in these hyperparameters, and elucidate where best practices for PEFT in proteomics differ from those in natural language processing. All our adaptation code is available open-source at https://github.com/microsoft/peft_proteomics . Thus, we provide a blueprint to democratize the power of PLM adaptation to groups with limited computational resources.

Language: English

Citations

22

Predictomes, a classifier-curated database of AlphaFold-modeled protein-protein interactions DOI Creative Commons
E. Schmid, Johannes C. Walter

Molecular Cell, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 1, 2025

Protein-protein interactions (PPIs) are ubiquitous in biology, yet a comprehensive structural characterization of the PPIs underlying cellular processes is lacking. AlphaFold-Multimer (AF-M) has the potential to fill this knowledge gap, but standard AF-M confidence metrics do not reliably separate relevant PPIs from an abundance of false positive predictions. To address this limitation, we used machine learning on curated datasets to train a structure prediction and omics-informed classifier (SPOC) that effectively separates true and false AF-M predictions of PPIs, including in proteome-wide screens. We applied SPOC to an all-by-all matrix of nearly 300 human genome maintenance proteins, generating ∼40,000 predictions that can be viewed at predictomes.org, where users can also score their own predictions with SPOC. High-confidence PPIs discovered using our approach enable hypothesis generation in genome maintenance. Our results provide a framework for interpreting large-scale AF-M screens and help lay the foundation for a proteome-wide structural interactome.

Language: English

Citations

7

Rapid and sensitive protein complex alignment with Foldseek-Multimer DOI Creative Commons
Woosub Kim, Milot Mirdita, Eli Levy Karin

et al.

Nature Methods, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 5, 2025

Advances in computational structure prediction will vastly augment the hundreds of thousands of currently available protein complex structures. Translating these into discoveries requires aligning them, which is computationally prohibitive. Foldseek-Multimer computes complex alignments from compatible chain-to-chain alignments, identified by efficiently clustering their superposition vectors. Foldseek-Multimer is 3–4 orders of magnitude faster than the gold standard, while producing comparable alignments; this allows it to compare billions of complex pairs in 11 h. Foldseek-Multimer is open-source software available at GitHub via https://github.com/steineggerlab/foldseek/ , https://search.foldseek.com/search/ and the BFMD database.

Language: English

Citations

5

COCOMO2: A Coarse-Grained Model for Interacting Folded and Disordered Proteins DOI Creative Commons
Alexander Jussupow, Divya Bartley, Lisa J. Lapidus

et al.

Journal of Chemical Theory and Computation, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 5, 2025

Biomolecular interactions are essential in many biological processes, including complex formation and phase separation processes. Coarse-grained computational models are especially valuable for studying such processes via simulation. Here, we present COCOMO2, an updated residue-based coarse-grained model that extends its applicability from intrinsically disordered peptides to folded proteins. This is accomplished with the introduction of a surface exposure scaling factor, which adjusts interaction strengths based on solvent accessibility, to enable more realistic modeling of interactions involving folded domains without additional computational costs. COCOMO2 was parametrized directly with solubility and phase separation data to improve its performance in predicting concentration-dependent phase separation for a broader range of biomolecular systems compared to the original version. COCOMO2 enables new applications, including the study of condensates that involve IDPs together with folded domains and the study of complex assembly processes. COCOMO2 also provides an expanded foundation for the development of multiscale approaches that span from residue-level to atomistic resolution.

Language: English

Citations

4

Cryo-EM reveals mechanisms of natural RNA multivalency DOI
Liu Wang, Jiahao Xie, Tao Gong

et al.

Science, Journal Year: 2025, Volume and Issue: unknown

Published: March 13, 2025

Homo-oligomerization of biological macromolecules leads to functional assemblies that are critical to understanding various cellular processes. However, RNA quaternary structures have been rarely reported. Comparative genomics analysis has identified RNA families containing hundreds of sequences that adopt conserved secondary structures and likely fold into complex three-dimensional (3D) structures. We use cryo-electron microscopy (cryo-EM) to determine structures from four RNA families, including ARRPOF and OLE forming dimers, and ROOL and GOLLD forming hexameric, octameric and dodecameric nanostructures, at 2.6 to 4.6 Å resolutions. These homo-oligomeric assemblies reveal a plethora of structural motifs that contribute to RNA multivalency, including kissing loop, palindromic base-pairing, A-stacking, metal ion coordination, pseudoknot and minor-groove interactions. These results provide the molecular basis of intermolecular interactions driving RNA multivalency with potential functional relevance.

Language: English

Citations

4

Scalable protein design using optimization in a relaxed sequence space DOI
Christopher L. Frank, Ali Khoshouei, Lara Fuß

et al.

Science, Journal Year: 2024, Volume and Issue: 386(6720), P. 439 - 445

Published: Oct. 24, 2024

Machine learning (ML)–based design approaches have advanced the field of de novo protein design, with diffusion-based generative methods increasingly dominating protein design pipelines. Here, we report a “hallucination”-based protein design approach that functions in relaxed sequence space, enabling the efficient design of high-quality protein backbones over multiple scales and with broad scope of application without the need for any form of retraining. We experimentally produced and characterized more than 100 proteins. Three high-resolution crystal structures and two cryo–electron microscopy density maps of designed single-chain proteins comprising up to 1000 amino acids validate the accuracy of the method. Our pipeline can also be used to design synthetic protein-protein interactions, as validated experimentally by a set of protein heterodimers. Relaxed sequence optimization offers attractive performance with respect to designability, scope of applicability to different design problems, and scalability across protein sizes.

Language: English

Citations

15

Predicted mechanistic impacts of human protein missense variants DOI Creative Commons
Jürgen Jänes, Marc Müller, Senthil Selvaraj

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: May 29, 2024

Genome sequencing efforts have led to the discovery of tens of millions of protein missense variants found in the human population, with the majority of these having no annotated role and some likely contributing to trait variation and disease. Sequence-based artificial intelligence approaches have become highly accurate at predicting variants that are detrimental to the function of proteins, but they do not inform on mechanisms of disruption. Here we combined sequence and structure-based methods to perform proteome-wide prediction of deleterious variants with information on their impact on protein stability, protein-protein interactions and small-molecule binding pockets. AlphaFold2 structures were used to predict approximately 100,000 small-molecule binding pockets and stability changes for over 200 million variants. To inform on protein-protein interfaces, we used AlphaFold2 to predict structures for nearly 500,000 protein complexes. We illustrate the value of mechanism-aware variant effect predictions to study the relation between protein stability and abundance and the structural properties of interfaces underlying trans protein quantitative trait loci (pQTLs). We characterised the distribution of mechanistic impacts of protein variants found in patients and experimentally studied example disease-linked variants in FGFR1.

Language: English

Citations

11

Bias in, bias out – AlphaFold-Multimer and the structural complexity of protein interfaces DOI
Joelle Strom, Katja Luck

Current Opinion in Structural Biology, Journal Year: 2025, Volume and Issue: 91, P. 103002 - 103002

Published: Feb. 12, 2025

Language: English

Citations

2