Design strategies and recent development of bioactive modulators for glutamine transporters
Xinying Cheng, Yezhi Wang, Guangyue Gong et al.

Drug Discovery Today, Year: 2024, Vol. 29(2), p. 103880

Published: Jan. 11, 2024

Language: English

AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences
Mihály Váradi, Damian Bertoni, Paulyna Magaña et al.

Nucleic Acids Research, Year: 2023, Vol. 52(D1), pp. D368 - D375

Published: Nov. 2, 2023

The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) has significantly impacted structural biology by amassing over 214 million predicted protein structures, expanding from the initial 300k structures released in 2021. Enabled by the groundbreaking AlphaFold2 artificial intelligence (AI) system, the predictions archived in AlphaFold DB have been integrated into primary data resources such as PDB, UniProt, Ensembl, InterPro and MobiDB. Our manuscript details subsequent enhancements in data archiving, covering successive releases encompassing model organisms, global health proteomes, Swiss-Prot integration, and a host of curated datasets. We detail the access mechanisms of AlphaFold DB, from direct file access via FTP to advanced queries using Google Cloud Public Datasets and the programmatic access endpoints of the database. We also discuss the improvements and services added since its release, including enhancements to the Predicted Aligned Error viewer, customisation options for the 3D viewer, and improvements in the search engine of AlphaFold DB.

Language: English

Cited by: 663

Machine Learning-Guided Protein Engineering
Petr Kouba, Pavel Kohout, Faraneh Haddadi et al.

ACS Catalysis, Year: 2023, Vol. 13(21), pp. 13863 - 13895

Published: Oct. 13, 2023

Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.

Language: English

Cited by: 98

Impact of AlphaFold on structure prediction of protein complexes: The CASP15‐CAPRI experiment
Marc F. Lensink, Guillaume Brysbaert, Nessim Raouraoua et al.

Proteins Structure Function and Bioinformatics, Year: 2023, Vol. 91(12), pp. 1658 - 1683

Published: Oct. 31, 2023

Abstract: We present the results for CAPRI Round 54, the 5th joint CASP‐CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homo‐trimers, 13 heterodimers including antibody–antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High‐quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2‐Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integration of advanced modeling tools, enabled top performing groups to exceed the performance of a standard AlphaFold2‐Multimer version used as a yard stick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.

Language: English

Cited by: 52

Artificial intelligence-aided protein engineering: from topological data analysis to deep protein language models
Yuchi Qiu, Guo‐Wei Wei

Briefings in Bioinformatics, Year: 2023, Vol. 24(5)

Published: Aug. 14, 2023

Protein engineering is an emerging field in biotechnology that has the potential to revolutionize various areas, such as antibody design, drug discovery, food security, ecology, and more. However, the mutational space involved is too vast to be handled through experimental means alone. Leveraging accumulative protein databases, machine learning (ML) models, particularly those based on natural language processing (NLP), have considerably expedited protein engineering. Moreover, advances in topological data analysis (TDA) and artificial intelligence-based protein structure prediction, such as AlphaFold2, have made more powerful structure-based ML-assisted protein engineering strategies possible. This review aims to offer a comprehensive, systematic, and indispensable set of methodological components, including TDA and NLP, for protein engineering and to facilitate their future development.

Language: English

Cited by: 44

Stability Oracle: a structure-based graph-transformer framework for identifying stabilizing mutations
Daniel J. Diaz, Chengyue Gong, Jeffrey Ouyang-Zhang et al.

Nature Communications, Year: 2024, Vol. 15(1)

Published: July 23, 2024

Engineering stabilized proteins is a fundamental challenge in the development of industrial and pharmaceutical biotechnologies. We present Stability Oracle: a structure-based graph-transformer framework that achieves SOTA performance on accurately identifying thermodynamically stabilizing mutations. Our framework introduces several innovations to overcome well-known challenges in data scarcity and bias, generalization, and computation time, such as: Thermodynamic Permutations for data augmentation, structural amino acid embeddings to model a mutation with a single structure, and a protein structure-specific attention-bias mechanism that makes transformers a viable alternative to graph neural networks. We provide training/test splits that mitigate data leakage and ensure proper model evaluation. Furthermore, to examine our data engineering contributions, we fine-tune ESM2 representations (Prostata-IFML) and achieve SOTA for sequence-based models. Notably, Stability Oracle outperforms Prostata-IFML even though it was pretrained on 2000X fewer proteins and has 548X fewer parameters. Our framework establishes a path for fine-tuning structure-based transformers to virtually any phenotype, a necessary task for accelerating the development of protein-based biotechnologies.

Language: English

Cited by: 28

Random, de novo, and conserved proteins: How structure and disorder predictors perform differently
Lasse Middendorf, Lars A. Eicholt

Proteins Structure Function and Bioinformatics, Year: 2024, Vol. 92(6), pp. 757 - 767

Published: Jan. 16, 2024

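Abstract: Understanding the emergence and structural characteristics of de novo and random proteins is crucial for unraveling protein evolution and designing novel enzymes. However, experimental determination of their structures remains challenging. Recent advancements in protein structure prediction, particularly with AlphaFold2 (AF2), have expanded our knowledge of protein structures, but their applicability to de novo and random proteins is unclear. In this study, we investigate the structural predictions and confidence scores of AF2 and the protein language model‐based predictor ESMFold for de novo and conserved proteins from Drosophila and a dataset of comparable random proteins. We find that the structural predictions for de novo and random proteins differ significantly from those for conserved proteins. Interestingly, a positive correlation between disorder and confidence scores (pLDDT) is observed for de novo and random proteins, in contrast to the negative correlation observed for conserved proteins. Furthermore, the performance of structure predictors for de novo and random proteins is hampered by the lack of sequence identity. We also observe fluctuating median predicted disorder among different sequence length quartiles for random proteins, suggesting an influence of sequence length on disorder predictions. In conclusion, while structure predictors provide initial insights into the structural composition of de novo and random proteins, their accuracy and applicability to such proteins remain limited. Experimental determination of their structures is necessary for a comprehensive understanding. The positive correlation between disorder and pLDDT could imply a potential for conditional folding and transient binding interactions of de novo and random proteins.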

Language: English

Cited by: 13

AI for organic and polymer synthesis

Hong Xin, Qi Yang, Kuangbiao Liao et al.

Science China Chemistry, Year: 2024, Vol. 67(8), pp. 2461 - 2496

Published: June 26, 2024

Language: English

Cited by: 12

State-of-the-Art in the Drug Discovery Pathway for Chagas Disease: A Framework for Drug Development and Target Validation
Juan Carlos Gabaldón-Figueira, Nieves Martínez-Peinado, Elisa Escabia et al.

Research and Reports in Tropical Medicine, Year: 2023, Vol. 14, pp. 1 - 19

Published: June 1, 2023

Abstract: Chagas disease is the most important protozoan infection in the Americas, and constitutes a significant public health concern throughout the world. Development of new medications against its etiologic agent, Trypanosoma cruzi, has been traditionally slow and difficult, lagging in comparison with diseases caused by other kinetoplastid parasites. Among the factors that explain this are the incompletely understood mechanisms of pathogenesis of T. cruzi infection and its complex set of interactions with the host in the chronic stage of the disease. These demand the performance of a variety of in vitro and in vivo assays as part of any drug development effort. In this review, we discuss recent breakthroughs in the understanding of the parasite's life cycle and their implications in the search for new chemotherapeutics. For this, we present a framework to guide drug discovery efforts against Chagas disease, considering state-of-the-art preclinical models and recently developed tools for the identification and validation of molecular targets. Keywords: drug development, drug screenings, drug target, animal models

Language: English

Cited by: 20

Machine learning-aided design and screening of an emergent protein function in synthetic cells
Shunshi Kohyama, Béla P. Frohn, Leon Babl et al.

Nature Communications, Year: 2024, Vol. 15(1)

Published: March 5, 2024

Abstract: Recently, utilization of Machine Learning (ML) has led to astonishing progress in computational protein design, bringing into reach the targeted engineering of proteins for industrial and biomedical applications. However, the design of proteins for emergent functions of core relevance to cells, such as the ability to spatiotemporally self-organize and thereby structure the cellular space, is still extremely challenging. While on the generative side conditional generative models and multi-state design are on the rise, for emergent functions there is a lack of tailored screening methods as typically needed in a protein design project, both computational and experimental. Here we describe a proof-of-principle of how such screening, in silico and in vitro, can be achieved for ML-generated variants of a protein that forms intracellular spatiotemporal patterns. For computational screening we use a structure-based divide-and-conquer approach to find the most promising candidates, while for the subsequent in vitro screening we use synthetic cell-mimics as established by Bottom-Up Synthetic Biology. We then show that the best screened candidate can indeed completely substitute the wildtype gene in Escherichia coli. These results raise great hopes for the next level of synthetic biology, where ML-designed synthetic proteins will be used to engineer cellular functions.

Language: English

Cited by: 9

An explainable language model for antibody specificity prediction using curated influenza hemagglutinin antibodies
Yiquan Wang, Huibin Lv, Qi Wen Teo et al.

Immunity, Year: 2024, Vol. 57(10), pp. 2453 - 2465.e7

Published: Aug. 19, 2024

Language: English

Cited by: 9