The structural context of posttranslational modifications at a proteome-wide scale
Isabell Bludau, Sander Willems, Wenfeng Zeng, et al.

PLoS Biology, Journal Year: 2022, Volume and Issue: 20(5), P. e3001636 - e3001636

Published: May 16, 2022

The recent revolution in computational protein structure prediction provides folding models for entire proteomes, which can now be integrated with large-scale experimental data. Mass spectrometry (MS)-based proteomics has identified and quantified tens of thousands of posttranslational modifications (PTMs), most of them of uncertain functional relevance. In this study, we determine the structural context of these PTMs and investigate how this information can be leveraged to pinpoint potential regulatory sites. Our analysis uncovers global patterns of PTM occurrence across folded and intrinsically disordered regions. We found that this information can help to distinguish regulatory PTMs from those marking improperly folded proteins. Interestingly, the human proteome contains thousands of proteins that have large folded domains linked by short, disordered regions that are strongly enriched in regulatory phosphosites. These include well-known kinase activation loops that induce protein conformational changes upon phosphorylation. This regulatory mechanism appears to be widespread in kinases but also occurs in other protein families such as solute carriers. It is not limited to phosphorylation but includes ubiquitination and acetylation sites as well. Furthermore, we performed three-dimensional proximity analysis, which revealed examples of spatial coregulation of different PTM types and potential PTM crosstalk. To enable the community to build upon these first analyses, we provide tools for 3D visualization of proteomics data and PTMs as well as Python libraries for data accession and processing.

Language: English

The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest
Damian Szklarczyk, Rebecca Kirsch, Mikaela Koutrouli, et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D638 - D646

Published: Nov. 12, 2022

Much of the complexity within cells arises from functional and regulatory interactions among proteins. The core of these interactions is increasingly known, but novel interactions continue to be discovered, and the information remains scattered across different database resources, experimental modalities and levels of mechanistic detail. The STRING database (https://string-db.org/) systematically collects and integrates protein-protein interactions, both physical interactions as well as functional associations. The data originate from a number of sources: automated text mining of the scientific literature, computational interaction predictions from co-expression, conserved genomic context, databases of interaction experiments and known complexes/pathways from curated sources. All of these interactions are critically assessed, scored, and subsequently automatically transferred to less well-studied organisms using hierarchical orthology information. The data can be accessed via the website, but also programmatically and via bulk downloads. The most recent developments in STRING (version 12.0) are: (i) it is now possible to create, browse and analyze a full interaction network for any novel genome of interest, by submitting its complement of encoded proteins, (ii) the co-expression channel now uses variational auto-encoders to predict interactions, and it covers two new sources, single-cell RNA-seq and experimental proteomics data and (iii) the confidence in each experimentally derived interaction is now estimated based on the detection method used, and communicated to the user in the web-interface. Furthermore, STRING continues to enhance its facilities for functional enrichment analysis, which are now fully available also for user-submitted genomes.

Language: English

Citations

3858

RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning
S.K. Burley, Charmi Bhikadiya, Chunxiao Bi, et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D488 - D508

Published: Nov. 24, 2022

The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ∼200 000 experimentally-determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a 'living data resource.' Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and the CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.

Language: English

Citations

505

Targeting p53 pathways: mechanisms, structures and advances in therapy

Haolan Wang, Ming Guo, Hudie Wei, et al.

Signal Transduction and Targeted Therapy, Journal Year: 2023, Volume and Issue: 8(1)

Published: March 1, 2023

The TP53 tumor suppressor is the most frequently altered gene in human cancers, and has been a major focus of oncology research. The p53 protein is a transcription factor that can activate the expression of multiple target genes and plays critical roles in regulating cell cycle, apoptosis, and genomic stability, and is widely regarded as the "guardian of the genome". Accumulating evidence has shown that p53 also regulates cell metabolism, ferroptosis, tumor microenvironment, autophagy and so on, all of which contribute to tumor suppression. Mutations in TP53 not only impair its tumor suppressor function, but also confer oncogenic properties to p53 mutants. Since p53 is mutated and inactivated in most malignant tumors, it has been a very attractive target for developing new anti-cancer drugs. However, until recently, p53 was considered an "undruggable" target and little progress has been made with p53-targeted therapies. Here, we provide a systematic review of the diverse molecular mechanisms of the p53 signaling pathway and how TP53 mutations impact tumor progression. We also discuss key structural features of the p53 protein and its inactivation by oncogenic mutations. In addition, we review the efforts that have been made in p53-targeted therapies, and discuss the challenges that have been encountered in clinical development.

Language: English

Citations

414

Scaffolding protein functional sites using deep learning
Jue Wang, Sidney Lisanza, David Juergens, et al.

Science, Journal Year: 2022, Volume and Issue: 377(6604), P. 387 - 394

Published: July 21, 2022

The binding and catalytic functions of proteins are generally mediated by a small number of functional residues held in place by the overall protein structure. Here, we describe deep learning approaches for scaffolding such functional sites without needing to prespecify the fold or secondary structure of the scaffold. The first approach, "constrained hallucination," optimizes sequences such that their predicted structures contain the desired functional site. The second approach, "inpainting," starts from the functional site and fills in additional sequence and structure to create a viable protein scaffold in a single forward pass through a specifically trained RoseTTAFold network. We use these two methods to design candidate immunogens, receptor traps, metalloproteins, enzymes, and protein-binding proteins and validate the designs using a combination of in silico and experimental tests.

Language: English

Citations

325

AlphaFold2 and its applications in the fields of biology and medicine
Zhenyu Yang, Xiaoxi Zeng, Yi Zhao, et al.

Signal Transduction and Targeted Therapy, Journal Year: 2023, Volume and Issue: 8(1)

Published: March 14, 2023

AlphaFold2 (AF2) is an artificial intelligence (AI) system developed by DeepMind that can predict three-dimensional (3D) structures of proteins from amino acid sequences with atomic-level accuracy. Protein structure prediction is one of the most challenging problems in computational biology and chemistry, and has puzzled scientists for 50 years. The advent of AF2 represents unprecedented progress in protein structure prediction and has attracted much attention. The subsequent release of structures of more than 200 million proteins predicted by AF2 further aroused great enthusiasm in the science community, especially in the fields of biology and medicine. AF2 is thought to have a significant impact on structural biology and on research areas that need protein structure information, such as drug discovery, protein design, and prediction of protein function. Though not much time has passed since AF2 was developed, there are already quite a few application studies of AF2 in the fields of biology and medicine, with many of them having preliminarily proved the potential of AF2. To better understand AF2 and promote its applications, we will in this article summarize the principle and system architecture of AF2 as well as the recipe of its success, and particularly focus on reviewing its applications in the fields of biology and medicine. Limitations of current AF2 prediction will also be discussed.

Language: English

Citations

267

Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants
Rui Yin, Brandon Y. Feng, Amitabh Varshney, et al.

Protein Science, Journal Year: 2022, Volume and Issue: 31(8)

Published: July 13, 2022

High-resolution experimental structural determination of protein-protein interactions has led to valuable mechanistic insights, yet due to the massive number of interactions and experimental limitations there is a need for computational methods that can accurately model their structures. Here we explore the use of the recently developed deep learning method, AlphaFold, to predict structures of protein complexes from sequence. With a benchmark of 152 diverse heterodimeric protein complexes, multiple implementations and parameters of AlphaFold were tested for accuracy. Remarkably, many cases (43%) had near-native models (medium or high critical assessment of predicted interactions accuracy) generated as top-ranked predictions by AlphaFold, greatly surpassing the performance of unbound protein-protein docking (9% success rate for near-native top-ranked models); however, AlphaFold modeling of antibody-antigen complexes within our set was unsuccessful. We identified sequence and structural features associated with lack of AlphaFold success, and we also investigated the impact of multiple sequence alignment input. Benchmarking of a multimer-optimized version of AlphaFold (AlphaFold-Multimer) with a set of recently released antibody-antigen structures confirmed a low rate of success for antibody-antigen complexes (11% success), and we found that T cell receptor-antigen complexes are likewise not accurately modeled by that algorithm, showing that adaptive immune recognition poses a challenge for the current AlphaFold algorithm and model. Overall, our study demonstrates that end-to-end deep learning can accurately model many transient protein complexes, and highlights areas of improvement for future developments to reliably model any protein-protein interaction of interest.

Language: English

Citations

261

Learning inverse folding from millions of predicted structures
Chloe Hsu, Robert Verkuil, Jason Liu, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2022, Volume and Issue: unknown

Published: April 10, 2022

We consider the problem of predicting a protein sequence from its backbone atom coordinates. Machine learning approaches to this problem to date have been limited by the number of available experimentally determined protein structures. We augment training data by nearly three orders of magnitude by predicting structures for 12M protein sequences using AlphaFold2. Trained with this additional data, a sequence-to-sequence transformer with invariant geometric input processing layers achieves 51% native sequence recovery on structurally held-out backbones with 72% recovery for buried residues, an overall improvement of almost 10 percentage points over existing methods. The model generalizes to a variety of more complex tasks including design of protein complexes, partially masked structures, binding interfaces, and multiple states.

Language: English

Citations

249
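
The "native sequence recovery" figures quoted in the inverse folding abstract above (51% overall, 72% for buried residues) are not defined in this listing. A minimal sketch, assuming the metric is the fraction of positions at which the designed residue matches the native one, optionally restricted to a subset such as buried residues; the function name and the mask argument are hypothetical and are not taken from the paper's code:

```python
from typing import Optional, Sequence


def sequence_recovery(
    native: str,
    designed: str,
    mask: Optional[Sequence[bool]] = None,
) -> float:
    """Fraction of scored positions where designed matches native.

    Assumption: both sequences are already aligned position by position.
    mask (hypothetical) selects the positions to score, e.g. buried residues.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if mask is not None and len(mask) != len(native):
        raise ValueError("mask must have the same length as the sequences")

    # Score every position unless a mask restricts the set.
    positions = [
        i for i in range(len(native)) if mask is None or mask[i]
    ]
    if not positions:
        raise ValueError("no positions selected for scoring")

    matches = sum(1 for i in positions if native[i] == designed[i])
    return matches / len(positions)
```

Calling `sequence_recovery("ACDE", "ACDF")` scores all four positions, while passing a mask scores only the selected ones, which is how a separate figure for buried residues could be obtained under this assumption.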

AI-based structure prediction empowers integrative structural analysis of human nuclear pores
Shyamal Mosalaganti, Agnieszka Obarska-Kosińska, Marc Siggel, et al.

Science, Journal Year: 2022, Volume and Issue: 376(6598)

Published: June 9, 2022

INTRODUCTION: The eukaryotic nucleus protects the genome and is enclosed by the two membranes of the nuclear envelope. Nuclear pore complexes (NPCs) perforate the nuclear envelope to facilitate nucleocytoplasmic transport. With a molecular weight of ∼120 MDa, the human NPC is one of the largest protein complexes. Its ∼1000 proteins are taken in multiple copies from a set of about 30 distinct nucleoporins (NUPs). They can be roughly categorized into two classes. Scaffold NUPs contain folded domains and form a cylindrical scaffold architecture around a central channel. Intrinsically disordered NUPs line the scaffold and extend into the central channel, where they interact with cargo complexes. The NPC architecture is highly dynamic. It responds to changes in nuclear envelope tension with conformational breathing that manifests in dilation and constriction movements. Elucidating the scaffold architecture, ultimately at atomic resolution, will be important for gaining a more precise understanding of NPC function and dynamics but imposes a substantial challenge for structural biologists.

RATIONALE: Considerable progress has been made toward this goal by a joint effort in the field. A synergistic combination of complementary approaches has turned out to be critical. In situ structural biology techniques were used to reveal the overall layout of the NPC scaffold that defines the spatial reference for molecular modeling. High-resolution structures of many NUPs were determined in vitro. Proteomic analysis and extensive biochemical work unraveled the interaction network of NUPs. Integrative modeling has been used to combine the different types of data, resulting in a rough outline of the NPC scaffold. Previous structural models of the human NPC, however, were patchy and limited in accuracy owing to several challenges: (i) Many of the high-resolution structures of individual NUPs have been solved from distantly related species and, consequently, do not comprehensively cover their human counterparts. (ii) The scaffold is interconnected by a set of intrinsically disordered linker NUPs that are not straightforwardly accessible to common structural biology techniques. (iii) The NPC scaffold intimately embraces the fused inner and outer nuclear membranes in a distinctive topology and cannot be studied in isolation. (iv) The conformational dynamics of scaffold NUPs limits the resolution achievable in structure determination.

RESULTS: In this study, we used artificial intelligence (AI)-based prediction to generate an extensive repertoire of structural models of human NUPs and their subcomplexes. The resulting models cover various domains and interfaces that so far remained structurally uncharacterized. Benchmarking against previous and unpublished x-ray and cryo-electron microscopy structures revealed unprecedented accuracy. We obtained well-resolved cryo-electron tomographic maps of both the constricted and dilated conformational states of the human NPC. Using integrative modeling, we fitted the structural models of individual NUPs into the cryo-electron microscopy maps. We explicitly included several linker NUPs and traced their trajectory through the NPC scaffold. We elucidated in great detail how membrane-associated and transmembrane NUPs are distributed across the fusion topology of both nuclear membranes. The resulting architectural model increases the structural coverage of the human NPC scaffold by about twofold. We extensively validated our model against both earlier and new experimental data. The completeness of our model has enabled microsecond-long coarse-grained molecular dynamics simulations of the NPC scaffold within an explicit membrane environment and solvent. These simulations reveal that the NPC scaffold prevents the constriction of the otherwise stable double-membrane fusion pore to small diameters in the absence of membrane tension.

CONCLUSION: Our 70-MDa atomically resolved model covers >90% of the human NPC scaffold. It captures conformational changes that occur during dilation and constriction. It also reveals the precise anchoring sites for intrinsically disordered NUPs, the identification of which is a prerequisite for a complete and dynamic model of the NPC. Our study exemplifies how AI-based structure prediction may accelerate the elucidation of subcellular architecture at atomic resolution.

Language: English

Citations

241

Protein structure predictions to atomic accuracy with AlphaFold
John Jumper, Demis Hassabis

Nature Methods, Journal Year: 2022, Volume and Issue: 19(1), P. 11 - 12

Published: Jan. 1, 2022

Language: English

Citations

234

AF2Complex predicts direct physical interactions in multimeric proteins with deep learning
Mu Gao, Davi Nakajima An, Jerry M. Parks, et al.

Nature Communications, Journal Year: 2022, Volume and Issue: 13(1)

Published: April 1, 2022

Accurate descriptions of protein-protein interactions are essential for understanding biological systems. Remarkably accurate atomic structures have been recently computed for individual proteins by AlphaFold2 (AF2). Here, we demonstrate that the same neural network models from AF2 developed for single protein sequences can be adapted to predict the structures of multimeric protein complexes without retraining. In contrast to common approaches, our method, AF2Complex, does not require paired multiple sequence alignments. It achieves higher accuracy than some complex protein-protein docking strategies and provides a significant improvement over AF-Multimer, a development of AlphaFold for multimeric proteins. Moreover, we introduce metrics for predicting direct protein-protein interactions between arbitrary protein pairs and validate AF2Complex on some challenging benchmark sets and the E. coli proteome. Lastly, using the cytochrome c biogenesis system I as an example, we present high-confidence models of three sought-after assemblies formed by eight members of this system.

Language: English

Citations

224