EnGens: a computational framework for generation and analysis of representative protein conformational ensembles
Anja Conev, Maurício Rigo, Didier Devaurs, et al.

Briefings in Bioinformatics, Journal Year: 2023, Volume and Issue: 24(4)

Published: July 1, 2023

Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as representative conformational ensembles. Recent advances in computational methods have led to an increase in the number of available structural datasets spanning conformational landscapes. However, extracting representative conformational ensembles from such datasets is not an easy task, and many methods have been developed to tackle it. Our new approach, EnGens (short for ensemble generation), collects these methods into a unified framework for generating and analyzing representative protein conformational ensembles. In this work, we: (1) provide an overview of existing methods and tools for representative protein structural ensemble generation and analysis; (2) unify existing approaches in an open-source Python package and a portable Docker image, providing interactive visualizations within a Jupyter Notebook pipeline; (3) test our pipeline on a few canonical examples from the literature. Representative ensembles produced by EnGens can be used for many downstream tasks such as protein-ligand docking, Markov state modeling of protein dynamics, and analysis of the effect of single-point mutations.

Language: English

OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization
Gustaf Ahdritz, Nazim Bouatta, Christina Floristean, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2022, Volume and Issue: unknown

Published: Nov. 22, 2022

AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however, lacks the code and data required to train new models. These are necessary to (i) tackle new tasks, like protein-ligand complex structure prediction, (ii) investigate the process by which the model learns, which remains poorly understood, and (iii) assess the model’s generalization capacity to unseen regions of fold space. Here we report OpenFold, a fast, memory-efficient, and trainable implementation of AlphaFold2. We train OpenFold from scratch, fully matching the accuracy of AlphaFold2. Having established parity, we assess OpenFold’s capacity to generalize across fold space by retraining it using carefully designed datasets. We find that OpenFold is remarkably robust at generalizing despite extreme reductions in training set size and diversity, including near-complete elisions of classes of secondary structure elements. By analyzing intermediate structures produced by OpenFold during training, we also gain surprising insights into the manner in which the model learns to fold proteins, discovering that spatial dimensions are learned sequentially. Taken together, our studies demonstrate the power and utility of OpenFold, which we believe will prove to be a crucial new resource for the protein modeling community.

Language: English

Citations

127

Modeling conformational states of proteins with AlphaFold
Davide Sala, Felipe Engelberger, Hassane S. Mchaourab, et al.

Current Opinion in Structural Biology, Journal Year: 2023, Volume and Issue: 81, P. 102645 - 102645

Published: June 29, 2023

Language: English

Citations

110

Protein structure prediction has reached the single-structure frontier
Thomas J. Lane

Nature Methods, Journal Year: 2023, Volume and Issue: 20(2), P. 170 - 173

Published: Jan. 13, 2023

Language: English

Citations

109

Protein structure prediction with in-cell photo-crosslinking mass spectrometry and deep learning
Kolja Stahl, Andrea Graziadei, Therese Dau, et al.

Nature Biotechnology, Journal Year: 2023, Volume and Issue: 41(12), P. 1810 - 1819

Published: March 20, 2023

While AlphaFold2 can predict accurate protein structures from the primary sequence, challenges remain for proteins that undergo conformational changes or for which few homologous sequences are known. Here we introduce AlphaLink, a modified version of the AlphaFold2 algorithm that incorporates experimental distance restraint information into its network architecture. By employing sparse experimental contacts as anchor points, AlphaLink improves on the performance of AlphaFold2 in predicting challenging targets. We confirm this experimentally by using the noncanonical amino acid photo-leucine to obtain information on residue-residue contacts inside cells by crosslinking mass spectrometry. The program can predict distinct conformations of proteins on the basis of the distance restraints provided, demonstrating the value of experimental data in driving protein structure prediction. The noise-tolerant framework for integrating data in protein structure prediction presented here opens a path to accurate characterization of protein structures from in-cell data.

Language: English

Citations

95

Biasing AlphaFold2 to predict GPCRs and kinases with user-defined functional or structural properties
Davide Sala, Peter W. Hildebrand, Jens Meiler, et al.

Frontiers in Molecular Biosciences, Journal Year: 2023, Volume and Issue: 10

Published: Feb. 16, 2023

Determining the three-dimensional structure of proteins in their native functional states has been a longstanding challenge in structural biology. While integrative structural biology has been the most effective way to get a high-accuracy structure of different conformations and mechanistic insights for larger proteins, advances in deep machine-learning algorithms have paved the way to fully computational predictions. In this field, AlphaFold2 (AF2) pioneered

Language: English

Citations

56

Systematic identification of conditionally folded intrinsically disordered regions by AlphaFold2
T. Reid Alderson, Iva Pritišanac, Đesika Kolarić, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2022, Volume and Issue: unknown

Published: Feb. 18, 2022

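The AlphaFold Protein Structure Database contains predicted structures for millions of proteins. For the majority of human proteins that contain intrinsically disordered regions (IDRs), which do not adopt a stable structure, it is generally assumed that these regions have low AlphaFold2 confidence scores that reflect low-confidence structural predictions. Here, we show that AlphaFold2 assigns confident structures to nearly 15% of human IDRs. By comparison to experimental NMR data for a subset of IDRs that are known to conditionally fold (i.e., upon binding or under other specific conditions), we find that AlphaFold2 often predicts the structure of the conditionally folded state. Based on databases of IDRs that are known to conditionally fold, we estimate that AlphaFold2 can identify conditionally folding IDRs at a precision as high as 88% at a 10% false positive rate, which is remarkable considering that conditionally folded IDR structures were minimally represented in its training data. We find that human disease mutations are nearly 5-fold enriched in conditionally folded IDRs over IDRs in general, and that up to 80% of IDRs in prokaryotes are predicted to conditionally fold, compared to less than 20% of eukaryotic IDRs. These results indicate that a large majority of IDRs in the proteomes of human and other eukaryotes function in the absence of conditional folding, but the regions that do acquire folds are more sensitive to mutations. We emphasize that the AlphaFold2 predictions do not reveal functionally relevant structural plasticity within IDRs and cannot offer realistic ensemble representations of conditionally folded IDRs.

Significance Statement: AlphaFold2 and other machine learning-based methods can accurately predict the structures of most proteins. However, nearly two-thirds of human proteins contain segments that are highly flexible and do not autonomously fold, otherwise known as intrinsically disordered regions (IDRs). In general, IDRs interconvert rapidly between a large number of different conformations, posing a significant problem for protein structure prediction methods that define one or a small number of stable conformations. Here, we found that AlphaFold2 can readily identify structures for a subset of IDRs that fold under certain conditions (conditional folding). We leverage AlphaFold2’s predictions of conditionally folded IDRs to quantify the extent of conditional folding across the tree of life, and to rationalize disease-causing mutations in IDRs.

Classifications: Biological Sciences; Biophysics and Computational Biology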

Language: English

Citations

41

Improved multimer prediction using massive sampling with AlphaFold in CASP15
Björn Wallner

Proteins Structure Function and Bioinformatics, Journal Year: 2023, Volume and Issue: 91(12), P. 1734 - 1746

Published: Aug. 7, 2023

AlphaFold2 has revolutionized structure prediction by achieving high accuracy comparable to experimentally determined structures. However, there is still room for improvement, especially for challenging cases like multimers. A key to the success of AlphaFold is its ability to assess and rank its own predictions. Our basic idea for the Wallner group in CASP15 was to exploit this excellent scoring function in AlphaFold by massive sampling. To achieve this goal, we conducted AlphaFold runs using six different settings, with and without templates, and with an increased number of recycles using both multimer v1 and v2 weights. In all instances, we enabled dropout layers during inference, allowing for sampling of uncertainty and enhancing the diversity of the generated models. In total, 274 289 models were generated for the 38 targets in CASP15, with a median of 4810 models per target. Of these 38 targets, 10 were high quality, 11 were medium quality, 11 were acceptable, and only 6 were incorrect. The improvement over the baseline method, NBIS-AF2-multimer, was substantial, with the mean DockQ increasing from 0.43 to 0.56, and with several targets showing a DockQ score increase of +0.6 units. This is remarkable, considering that Wallner and NBIS-AF2-multimer were using identical input data. The improvement can be attributed to the diversified sampling using the different settings and, in particular, the use of multimer v1, which is much more susceptible to sampling compared to v2. The method is available here: http://wallnerlab.org/AFsample/.

Language: English

Citations

38

AlphaFold2 and Deep Learning for Elucidating Enzyme Conformational Flexibility and Its Application for Design
Guillem Casadevall, Cristina Duran, Sílvia Osuna, et al.

JACS Au, Journal Year: 2023, Volume and Issue: 3(6), P. 1554 - 1562

Published: June 6, 2023

The recent success of AlphaFold2 (AF2) and other deep learning (DL) tools in accurately predicting the folded three-dimensional (3D) structure of proteins and enzymes has revolutionized the structural biology and protein design fields. The 3D structure indeed reveals key information on the arrangement of the catalytic machinery of enzymes and on which structural elements gate the active site pocket. However, comprehending enzymatic activity requires a detailed knowledge of the chemical steps involved along the catalytic cycle and the exploration of the multiple thermally accessible conformations that enzymes adopt when in solution. In this Perspective, some of the recent studies showing the potential of AF2 in elucidating the conformational landscape of enzymes are provided. Selected examples of the key developments of AF2-based and DL methods for protein design are discussed, as well as a few enzyme design cases. These studies show the potential of AF2 and DL for allowing the routine computational design of efficient enzymes.

Language: English

Citations

34

Evolutionary selection of proteins with two folds
Joseph W. Schafer, Lauren L. Porter

Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)

Published: Sept. 6, 2023

Although most globular proteins fold into a single stable structure, an increasing number have been shown to remodel their secondary and tertiary structures in response to cellular stimuli. State-of-the-art algorithms predict that these fold-switching proteins adopt only one stable structure, missing their functionally critical alternative folds. Why these algorithms predict a single fold is unclear, but all of them infer protein structure from coevolved amino acid pairs. Here, we hypothesize that coevolutionary signatures are being missed. Suspecting that single-fold variants could be masking these signatures, we developed an approach, called Alternative Contact Enhancement (ACE), to search both highly diverse protein superfamilies (composed of single-fold and fold-switching variants) and protein subfamilies with more fold-switching variants. ACE successfully revealed coevolution of amino acid pairs uniquely corresponding to both conformations of 56/56 fold-switching proteins from distinct families. Then, we used ACE-derived contacts to (1) predict two experimentally consistent conformations of a candidate protein with unsolved structure and (2) develop a blind prediction pipeline for fold-switching proteins. The discovery of widespread dual-fold coevolution indicates that fold-switching sequences have been preserved by natural selection, implying that their functionalities provide evolutionary advantage and paving the way for predictions of diverse protein structures from single sequences.

Language: English

Citations

31

Computational drug development for membrane protein targets
Haijian Li, Xiaolin Sun, Wenqiang Cui, et al.

Nature Biotechnology, Journal Year: 2024, Volume and Issue: 42(2), P. 229 - 242

Published: Feb. 1, 2024

Language: English

Citations

17