SFCalculator: connecting deep generative models and crystallography DOI Creative Commons
Minhuan Li, Kevin M. Dalton, Doeke R. Hekstra

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 15, 2025

Abstract Proteins drive biochemical transformations by transitioning through distinct conformational states. Understanding these states is essential for modulating protein function. Although X-ray crystallography has enabled revolutionary advances in protein structure prediction by machine learning, this connection was made at the level of atomic models, not the underlying data. This lack of connection to crystallographic data limits the potential for further advances in both the accuracy of protein structure prediction and the application of machine learning to experimental structure determination. Here, we present SFCalculator, a differentiable pipeline that generates crystallographic observables from atomistic molecular structures with bulk solvent correction, bridging crystallographic data and neural network-based molecular modeling. We validate SFCalculator against conventional methods and demonstrate its utility by establishing three important proof-of-concept applications. First, SFCalculator enables accurate placement of molecular models relative to crystal lattices (known as phasing). Second, SFCalculator enables the search of the latent space of generative models for conformations that fit crystallographic data and are, therefore, also implicitly constrained by the information encoded by the model. Finally, SFCalculator enables the use of crystallographic data during training of generative models, enabling these models to generate an ensemble of conformations consistent with crystallographic data. SFCalculator, therefore, enables a new generation of analytical paradigms integrating crystallographic data and machine learning.

Language: English

AlphaFold predictions are valuable hypotheses and accelerate but do not replace experimental structure determination DOI Creative Commons
Thomas C. Terwilliger, Dorothée Liebschner, Tristan I. Croll

et al.

Nature Methods, Journal Year: 2023, Volume and Issue: 21(1), P. 110 - 116

Published: Nov. 30, 2023

Abstract Artificial intelligence-based protein structure prediction methods such as AlphaFold have revolutionized structural biology. The accuracies of these predictions vary, however, and they do not take into account ligands, covalent modifications or other environmental factors. Here, we evaluate how well AlphaFold predictions can be expected to describe the structure of a protein by comparing predictions directly with experimental crystallographic maps. In many cases, AlphaFold predictions matched experimental maps remarkably closely. In other cases, even very high-confidence predictions differed from experimental maps on a global scale through distortion and domain orientation, and on a local scale in backbone and side-chain conformation. We suggest considering AlphaFold predictions as exceptionally useful hypotheses. We further suggest that it is important to consider the confidence in the prediction when interpreting AlphaFold predictions and to carry out experimental structure determination to verify structural details, particularly those that involve interactions not included in the prediction.

Language: English

Citations

170

Before and after AlphaFold2: An overview of protein structure prediction DOI Creative Commons
Letícia M. F. Bertoline, Angélica N. Lima, José Eduardo Krieger

et al.

Frontiers in Bioinformatics, Journal Year: 2023, Volume and Issue: 3

Published: Feb. 28, 2023

Three-dimensional protein structure is directly correlated with its function, and its determination is critical to understanding biological processes and addressing human health and life science problems in general. Although new protein structures are experimentally obtained over time, there is still a large difference between the number of protein sequences placed in Uniprot and those with resolved tertiary structure. In this context, studies have emerged to predict protein structures by methods based on a template or free modeling. In the last years, different methods have been combined to overcome their individual limitations, until the emergence of AlphaFold2, which demonstrated that predicting protein structure with high accuracy at unprecedented scale is possible. Despite its current impact in the field, AlphaFold2 has limitations. Recently, new methods based on protein language models have promised to revolutionize protein structural biology, allowing the discovery of protein structure and function only from evolutionary patterns present in the protein sequence. Even though these methods do not reach AlphaFold2 accuracy, they have already covered some of its limitations, being able to predict with high accuracy more than 200 million proteins from metagenomic databases. In this mini-review, we provide an overview of the breakthroughs in protein structure prediction before and after the emergence of AlphaFold2.

Language: English

Citations

163

Mechanisms and pathology of protein misfolding and aggregation DOI
Nikolaos Louros, Joost Schymkowitz, Frédéric Rousseau

et al.

Nature Reviews Molecular Cell Biology, Journal Year: 2023, Volume and Issue: 24(12), P. 912 - 933

Published: Sept. 8, 2023

Language: English

Citations

122

The power and pitfalls of AlphaFold2 for structure prediction beyond rigid globular proteins DOI
Vinayak Agarwal, Andrew C. McShan

Nature Chemical Biology, Journal Year: 2024, Volume and Issue: 20(8), P. 950 - 959

Published: June 21, 2024

Language: English

Citations

32

Mechanism of single-stranded DNA annealing by RAD52–RPA complex DOI Creative Commons

Chih-Chao Liang, Luke A. Greenhough, Laura Masino

et al.

Nature, Journal Year: 2024, Volume and Issue: 629(8012), P. 697 - 703

Published: April 24, 2024

Abstract RAD52 is important for the repair of DNA double-stranded breaks [1,2], mitotic DNA synthesis [3–5] and alternative telomere length maintenance [6,7]. Central to these functions, RAD52 promotes the annealing of complementary single-stranded DNA (ssDNA) [8,9] and provides an alternative to BRCA2/RAD51-dependent homologous recombination repair [10]. Inactivation of RAD52 in homologous-recombination-deficient BRCA1- or BRCA2-defective cells is synthetically lethal [11,12], and aberrant expression of RAD52 is associated with poor cancer prognosis [13,14]. As a consequence, RAD52 is an attractive therapeutic target against homologous-recombination-deficient breast, ovarian and prostate cancers [15–17]. Here we describe the structure of RAD52 and define the mechanism of annealing. As reported previously [18–20], RAD52 forms undecameric (11-subunit) ring structures, but these rings do not represent the active form of the enzyme. Instead, cryo-electron microscopy and biochemical analyses revealed that ssDNA annealing is driven by RAD52 open rings in association with replication protein-A (RPA). Atomic models of the RAD52–ssDNA complex show that ssDNA sits in a positively charged channel around the ring. Annealing is driven by the RAD52 N-terminal domains, whereas the C-terminal regions modulate the open-ring conformation and RPA interaction. RPA associates with RAD52 at the site of ring opening, with critical interactions occurring between the RPA-interacting domain of RAD52 and the winged helix domain of RPA2. Our studies provide structural snapshots throughout the annealing process and define the molecular mechanism of ssDNA annealing by the RAD52–RPA complex.

Language: English

Citations

23

MoDAFold: a strategy for predicting the structure of missense mutant protein based on AlphaFold2 and molecular dynamics DOI Creative Commons
Lingyan Zheng, Shuiyang Shi, Xiuna Sun

et al.

Briefings in Bioinformatics, Journal Year: 2024, Volume and Issue: 25(2)

Published: Jan. 22, 2024

Abstract Protein structure prediction is a longstanding issue crucial for identifying new drug targets and providing a mechanistic understanding of protein functions. To enhance the progress in this field, a spectrum of computational methodologies has been cultivated. AlphaFold2 has exhibited exceptional precision in predicting wild-type protein structures, with performance exceeding that of other methods. However, predicting the structures of missense mutant proteins using AlphaFold2 remains challenging due to the intricate and substantial structural alterations caused by minor sequence variations in the mutant proteins. Molecular dynamics (MD) has been validated for precisely capturing changes in amino acid interactions attributed to protein mutations. Therefore, for the first time, a strategy entitled ‘MoDAFold’ was proposed to improve the accuracy and reliability of missense mutant protein structure prediction by combining AlphaFold2 with MD. Multiple case studies have confirmed the superior performance of MoDAFold compared to other methods, particularly AlphaFold2.

Language: English

Citations

20

Accelerating crystal structure determination with iterative AlphaFold prediction DOI Creative Commons
Thomas C. Terwilliger, Pavel V. Afonine, Dorothée Liebschner

et al.

Acta Crystallographica Section D Structural Biology, Journal Year: 2023, Volume and Issue: 79(3), P. 234 - 244

Published: Feb. 15, 2023

Experimental structure determination can be accelerated with artificial intelligence (AI)-based structure-prediction methods such as AlphaFold. Here, an automatic procedure requiring only sequence information and crystallographic data is presented that uses AlphaFold predictions to produce an electron-density map and a structural model. Iterating through cycles of structure prediction is a key element of this procedure: a predicted model rebuilt in one cycle is used as a template for prediction in the next cycle. This procedure was applied to X-ray data for 215 structures released by the Protein Data Bank in a recent six-month period. In 87% of cases our procedure yielded a model with at least 50% of Cα atoms matching those in the deposited models within 2 Å. Predictions from the iterative template-guided prediction procedure were more accurate than those obtained without templates. It is concluded that AlphaFold predictions obtained based on sequence information alone are usually accurate enough to solve the crystallographic phase problem with molecular replacement, and a general strategy for macromolecular structure determination that includes AI-based prediction both as a starting point and as a method of model optimization is suggested.

Language: English

Citations

40

Protein Structure Prediction: Challenges, Advances, and the Shift of Research Paradigms DOI Creative Commons
Bin Huang, Lupeng Kong, Chao Wang

et al.

Genomics Proteomics & Bioinformatics, Journal Year: 2023, Volume and Issue: 21(5), P. 913 - 925

Published: March 30, 2023

Abstract Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields, including biochemistry, medicine, physics, mathematics, and computer science. These researchers adopt various research paradigms to attack the same structure prediction problem: biochemists and physicists attempt to reveal the principles governing protein folding; mathematicians, especially statisticians, usually start from assuming a probability distribution of protein structures given a target sequence and then find the most likely structure; while computer scientists formulate protein structure prediction as an optimization problem, that is, finding the structural conformation with the lowest energy or minimizing the difference between the predicted structure and the native structure. These research paradigms fall into the two statistical modeling cultures proposed by Leo Breiman, namely, data modeling and algorithmic modeling. Recently, we have also witnessed the great success of deep learning in protein structure prediction. In this review, we present a survey of the efforts for protein structure prediction. We compare the research paradigms adopted by researchers from different fields, with an emphasis on the shift of research paradigms in the era of deep learning. In short, the algorithmic modeling techniques, especially deep neural networks, have considerably improved the accuracy of protein structure prediction; however, theories interpreting the neural networks and knowledge on protein folding are still highly desired.

Language: English

Citations

35

MatGPT: A Vane of Materials Informatics from Past, Present, to Future DOI
Zhilong Wang, An Chen, Kehao Tao

et al.

Advanced Materials, Journal Year: 2023, Volume and Issue: 36(6)

Published: Oct. 10, 2023

Abstract Combining materials science, artificial intelligence (AI), physical chemistry, and other disciplines, materials informatics is continuously accelerating the vigorous development of new materials. The emergence of “GPT (Generative Pre‐trained Transformer) AI” shows that the scientific research field has entered the era of intelligent civilization with “data” as the basic factor and “algorithm + computing power” as the core productivity. The continuous innovation of AI will impact the cognitive laws and scientific methods, and reconstruct the knowledge and wisdom system. This leads to think more about materials informatics. Here, a comprehensive discussion of AI models and materials infrastructures is provided, and the advances in the discovery and design of new materials are reviewed. With the rise of new research paradigms triggered by “AI for Science”, the vane of materials informatics, “MatGPT”, is proposed, and the technical path planning from the aspects of data, descriptors, generative models, pretraining models, directed design models, collaborative training, experimental robots, as well as the efforts and preparations needed to develop a new generation of materials informatics, is carried out. Finally, the challenges and constraints faced by materials informatics are discussed, in order to achieve a more digital, intelligent, and automated construction of materials informatics with the joint efforts of more interdisciplinary scientists.

Language: English

Citations

33

Does AlphaFold2 model proteins’ intracellular conformations? An experimental test using cross-linking mass spectrometry of endogenous ciliary proteins DOI Creative Commons
Caitlyn L McCafferty, Erin L. Pennington, Ophelia Papoulas

et al.

Communications Biology, Journal Year: 2023, Volume and Issue: 6(1)

Published: April 15, 2023

A major goal in structural biology is to understand protein assemblies in their biologically relevant states. Here, we investigate whether AlphaFold2 structure predictions match native protein conformations. We chemically cross-linked proteins in situ within intact Tetrahymena thermophila cilia and ciliary extracts, identifying 1,225 intramolecular cross-links within the 100 best-sampled proteins, providing a benchmark of distance restraints obeyed by proteins in their native assemblies. The corresponding structure predictions were highly concordant, positioning 86.2% of cross-linked residues within Cɑ-to-Cɑ distances of 30 Å, consistent with the cross-linker length. 43% of proteins showed no violations. Most inconsistencies occurred in low-confidence regions or between domains. Overall, AlphaFold2 predictions with lower predicted aligned error corresponded to more correct native structures. However, we observe cases where rigid body domains are oriented incorrectly, as for ciliary protein BBC118, suggesting that combining structure prediction with experimental information will better reveal biologically relevant conformations.

Language: English

Citations

25