Topology in soft and biological matter
Luca Tubiana, Gareth P. Alexander, Agnese Barbensi

et al.

Physics Reports, Journal Year: 2024, Volume and Issue: 1075, P. 1 - 137

Published: May 16, 2024

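The last years have witnessed remarkable advances in our understanding of the emergence and consequences of topological constraints in biological and soft matter. Examples are abundant in relation to (bio)polymeric systems and range from the characterization of knots in single polymers and proteins to that of whole chromosomes and polymer melts. At the same time, considerable advances have been made in the description of the interplay between topological and physical properties in complex fluids, with the development of techniques that now allow researchers to control the formation and interaction of topological defects in diverse classes of liquid crystals. Thanks to technological progress and the integration of experiments with increasingly sophisticated numerical simulations, topological biological and soft matter is a vibrant area of research attracting scientists from a broad range of disciplines. However, owing to the high degree of specialization of modern science, many results have remained confined to their own particular fields, with different jargon making it difficult for researchers to share ideas and work together towards a comprehensive view of the diverse phenomena at play. Compelled by these motivations, here we present a comprehensive overview of topological effects in systems ranging from DNA and genome organization to entangled proteins, polymeric materials, liquid crystals, and theoretical physics, with the intention of reducing the barriers between different fields of soft matter and biophysics. Particular care has been taken in providing a coherent formal introduction to the topological properties of polymers and of continuum materials, and in highlighting the underlying common aspects concerning the emergence, characterization, and effects of topological objects in different systems. The second half of the review is dedicated to the presentation of the latest results in selected problems, specifically, the effects of topological constraints on the viscoelastic properties of polymeric materials; their relation with genome organization; a discussion on the emergence and possible effects of knots and other entanglements in proteins; and the emergence and effects of topological defects and solitons in complex fluids. This review is dedicated to the memory of Marek Cieplak.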

Language: English

Evolutionary-scale prediction of atomic-level protein structure with a language model
Zeming Lin, Halil Akin, Roshan Rao

et al.

Science, Journal Year: 2023, Volume and Issue: 379(6637), P. 1123 - 1130

Published: March 16, 2023

Recent advances in machine learning have leveraged evolutionary information in multiple sequence alignments to predict protein structure. We demonstrate direct inference of full atomic-level protein structure from primary sequence using a large language model. As language models of protein sequences are scaled up to 15 billion parameters, an atomic-resolution picture of protein structure emerges in the learned representations. This results in an order-of-magnitude acceleration of high-resolution structure prediction, which enables large-scale structural characterization of metagenomic proteins. We apply this capability to construct the ESM Metagenomic Atlas by predicting structures for >617 million metagenomic protein sequences, including >225 million that are predicted with high confidence, which gives a view into the vast breadth and diversity of natural proteins.

Language: English

Citations

2210

A structural biology community assessment of AlphaFold2 applications
Mehmet Akdel, Douglas E. V. Pires, Eduard Porta‐Pardo

et al.

Nature Structural & Molecular Biology, Journal Year: 2022, Volume and Issue: 29(11), P. 1056 - 1067

Published: Nov. 1, 2022

Most proteins fold into 3D structures that determine how they function and orchestrate the biological processes of the cell. Recent developments in computational methods for protein structure predictions have reached the accuracy of experimentally determined models. Although this has been independently verified, the implementation of these methods across structural-biology applications remains to be tested. Here, we evaluate the use of AlphaFold2 (AF2) predictions in the study of characteristic structural elements; the impact of missense variants; function and ligand binding site predictions; modeling of interactions; and modeling of experimental structural data. For 11 proteomes, an average of 25% additional residues can be confidently modeled when compared with homology modeling, identifying structural features rarely seen in the Protein Data Bank. AF2-based predictions of protein disorder and complexes surpass dedicated tools, and AF2 models can be used across diverse applications equally well compared with experimentally determined structures, when the confidence metrics are critically considered. In summary, we find that these advances are likely to have a transformative impact in structural biology and broader life-science research.

Language: English

Citations

469

AlphaFold2 and its applications in the fields of biology and medicine
Zhenyu Yang, Xiaoxi Zeng, Yi Zhao

et al.

Signal Transduction and Targeted Therapy, Journal Year: 2023, Volume and Issue: 8(1)

Published: March 14, 2023

AlphaFold2 (AF2) is an artificial intelligence (AI) system developed by DeepMind that can predict three-dimensional (3D) structures of proteins from amino acid sequences with atomic-level accuracy. Protein structure prediction is one of the most challenging problems in computational biology and chemistry, and has puzzled scientists for 50 years. The advent of AF2 presents unprecedented progress in protein structure prediction and has attracted much attention. The subsequent release of structures of more than 200 million proteins predicted by AF2 further aroused great enthusiasm in the science community, especially in the fields of biology and medicine. AF2 is thought to have a significant impact on structural biology and on research areas that need protein structure information, such as drug discovery, protein design, and prediction of protein function, among others. Though the time is not long since AF2 was developed, there are already quite a few application studies of AF2 in the fields of biology and medicine, with many of them having preliminarily proved the potential of AF2. To better understand AF2 and promote its applications, we will in this article summarize the principle and system architecture of AF2 as well as the recipe of its success, and particularly focus on reviewing its applications in the fields of biology and medicine. Limitations of current AF2 prediction will also be discussed.

Language: English

Citations

267

Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies
Jeffrey A. Ruffolo, Lee‐Shin Chu, Sai Pooja Mahajan

et al.

Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)

Published: April 25, 2023

Antibodies have the capacity to bind a diverse set of antigens, and they have become critical therapeutics and diagnostic molecules. The binding of antibodies is facilitated by a set of six hypervariable loops that are diversified through genetic recombination and mutation. Even with recent advances, accurate structural prediction of these loops remains a challenge. Here, we present IgFold, a fast deep learning method for antibody structure prediction. IgFold consists of a pre-trained language model trained on 558 million natural antibody sequences followed by graph networks that directly predict backbone atom coordinates. IgFold predicts structures of similar or better quality than alternative methods (including AlphaFold) in significantly less time (under 25 s). Accurate structure prediction on this timescale makes possible avenues of investigation that were previously infeasible. As a demonstration of IgFold’s capabilities, we predicted structures for 1.4 million paired antibody sequences, providing structural insights to 500-fold more antibodies than have experimentally determined structures.

Language: English

Citations

172

Clustering predicted structures at the scale of the known protein universe
Inigo Barrio‐Hernandez, Jingi Yeo, Jürgen Jänes

et al.

Nature, Journal Year: 2023, Volume and Issue: 622(7983), P. 637 - 645

Published: Sept. 13, 2023

Proteins are key to all cellular processes and their structure is important in understanding their function and evolution. Sequence-based predictions of protein structures have increased in accuracy, and over 214 million predicted structures are available in the AlphaFold database. However, studying protein structures at this scale requires highly efficient methods. Here, we developed a structural-alignment-based clustering algorithm, Foldseek cluster, that can cluster hundreds of millions of structures. Using this method, we have clustered all of the structures in the AlphaFold database, identifying 2.30 million non-singleton structural clusters, of which 31% lack annotations, representing probable previously undescribed structures. Clusters without annotation tend to have few representatives, covering only 4% of all proteins in the AlphaFold database. Evolutionary analysis suggests that most clusters are ancient in origin but some seem to be species specific, representing lower-quality predictions or examples of de novo gene birth. We also show how structural comparisons can be used to predict domain families and their relationships, identifying examples of remote structural similarity. On the basis of these analyses, we identify several examples of human immune-related proteins with putative remote homology in prokaryotic species, illustrating the value of this resource for studying protein function and evolution across the tree of life.

Language: English

Citations

170

OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization
Gustaf Ahdritz, Nazim Bouatta, Christina Floristean

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2022, Volume and Issue: unknown

Published: Nov. 22, 2022

AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however, lacks the code and data required to train new models. These are necessary to (i) tackle new tasks, like protein-ligand complex structure prediction, (ii) investigate the process by which the model learns, which remains poorly understood, and (iii) assess the model’s generalization capacity to unseen regions of fold space. Here we report OpenFold, a fast, memory-efficient, and trainable implementation of AlphaFold2. We train OpenFold from scratch, fully matching the accuracy of AlphaFold2. Having established parity, we assess OpenFold’s capacity to generalize across fold space by retraining it using carefully designed datasets. We find that OpenFold is remarkably robust at generalizing despite extreme reductions in training set size and diversity, including near-complete elisions of classes of secondary structure elements. By analyzing intermediate structures produced by OpenFold during training, we also gain surprising insights into the manner in which the model learns to fold proteins, discovering that spatial dimensions are learned sequentially. Taken together, our studies demonstrate the power and utility of OpenFold, which we believe will prove to be a crucial resource for the protein modeling community.

Language: English

Citations

125

OpenFold: retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization
Gustaf Ahdritz, Nazim Bouatta, Christina Floristean

et al.

Nature Methods, Journal Year: 2024, Volume and Issue: 21(8), P. 1514 - 1524

Published: May 14, 2024

AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however, lacks the code and data required to train new models. These are necessary to (1) tackle new tasks, like protein–ligand complex structure prediction, (2) investigate the process by which the model learns and (3) assess the model's capacity to generalize to unseen regions of fold space. Here we report OpenFold, a fast, memory efficient and trainable implementation of AlphaFold2. We train OpenFold from scratch, matching the accuracy of AlphaFold2. Having established parity, we find that OpenFold is remarkably robust at generalizing even when the size and diversity of its training set is deliberately limited, including near-complete elisions of classes of secondary structure elements. By analyzing intermediate structures produced during training, we also gain insights into the hierarchical manner in which OpenFold learns to fold. In sum, our studies demonstrate the power and utility of OpenFold, which we believe will prove to be a crucial resource for the protein modeling community. OpenFold is open source. It is fast and memory efficient, and is available under a permissive license.

Language: English

Citations

125

Modeling conformational states of proteins with AlphaFold
Davide Sala, Felipe Engelberger, Hassane S. Mchaourab

et al.

Current Opinion in Structural Biology, Journal Year: 2023, Volume and Issue: 81, P. 102645

Published: June 29, 2023

Language: English

Citations

110

Transformer-based deep learning for predicting protein properties in the life sciences
Abel Chandra, Laura Tünnermann, Tommy Löfstedt

et al.

eLife, Journal Year: 2023, Volume and Issue: 12

Published: Jan. 18, 2023

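Recent developments in deep learning, coupled with an increasing number of sequenced proteins, have led to a breakthrough in life science applications, in particular in protein property prediction. There is hope that deep learning can close the gap between the number of sequenced proteins and proteins with known properties based on lab experiments. Language models from the field of natural language processing have gained popularity for protein property predictions and have led to a new computational revolution in biology, where old prediction results are being improved regularly. Such models can learn useful multipurpose representations of proteins from large open repositories of protein sequences and can be used, for instance, to predict protein properties. The field of natural language processing is growing quickly because of developments in a class of models built on a particular model, the Transformer model. We review recent developments and the use of large-scale Transformer models in applications for predicting protein characteristics and how such models can be used to predict, for example, post-translational modifications. We review shortcomings of other deep learning models and explain how the Transformer models have quickly proven to be a very promising way to unravel information hidden in the sequences of amino acids.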

Language: English

Citations

98

Computational and artificial intelligence-based methods for antibody development
Ji‐Sun Kim, Matthew McFee, Qiao Fang

et al.

Trends in Pharmacological Sciences, Journal Year: 2023, Volume and Issue: 44(3), P. 175 - 189

Published: Jan. 18, 2023

Due to their high target specificity and binding affinity, therapeutic antibodies are currently the largest class of biotherapeutics. The traditional, largely empirical antibody development process is, while mature and robust, cumbersome and has significant limitations. Substantial recent advances in computational and artificial intelligence (AI) technologies are now starting to overcome many of these limitations and are increasingly integrated into development pipelines. Here, we provide an overview of AI methods relevant for antibody development, including databases, computational predictors of antibody properties and structure, and computational antibody design methods, with an emphasis on machine learning (ML) models and on the design of complementarity-determining region (CDR) loops, the antibody structural components critical for binding.

Language: English

Citations

96