What does it take for an ‘AlphaFold Moment’ in functional protein engineering and design?
Roberto A. Chica, Noelia Ferruz

Nature Biotechnology, Year: 2024, Volume: 42(2), Pages: 173-174

Published: Feb. 1, 2024

Language: English

Machine Learning-Guided Protein Engineering (Creative Commons)
Petr Kouba, Pavel Kohout, Faraneh Haddadi

et al.

ACS Catalysis, Year: 2023, Volume: 13(21), Pages: 13863-13895

Published: Oct. 13, 2023

Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.

Language: English

Cited by

98

De novo protein design: From new structures to programmable functions (Creative Commons)
Tanja Kortemme

Cell, Year: 2024, Volume: 187(3), Pages: 526-544

Published: Feb. 1, 2024

Methods from artificial intelligence (AI) trained on large datasets of sequences and structures can now "write" proteins with new shapes and molecular functions de novo, without starting from proteins found in nature. In this Perspective, I will discuss the state of the field of de novo protein design at the juncture of physics-based modeling approaches and AI. New protein folds and higher-order assemblies can be designed with considerable experimental success rates, and difficult problems requiring tunable control over protein conformations and precise shape complementarity for molecular recognition are coming into reach. Emerging approaches incorporate engineering principles (tunability, controllability, and modularity) into the design process from the beginning. Exciting frontiers lie in deconstructing cellular functions with de novo proteins and, conversely, constructing synthetic cellular signaling from the ground up. As methods improve, many more challenges are unsolved.

Language: English

Cited by

97

Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering (Creative Commons)
Jason Yang, Francesca-Zhoufan Li, Frances H. Arnold

et al.

ACS Central Science, Year: 2024, Volume: 10(2), Pages: 226-241

Published: Feb. 5, 2024

Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency, or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity, followed by directed evolution to improve its "fitness" for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery by functional annotation of known protein sequences or generating novel protein sequences with desired functions and (2) navigating protein fitness landscapes for fitness optimization by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.

Language: English

Cited by

78

Bilingual Language Model for Protein Sequence and Structure (Creative Commons)
Michael Heinzinger, Konstantin Weißenow, Joaquin Gomez Sanchez

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Volume: unknown

Published: July 25, 2023

Adapting large language models (LLMs) to protein sequences spawned the development of powerful protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein structure prediction. Now we can systematically and comprehensively explore the dual nature of proteins that act and exist as three-dimensional (3D) machines and evolve as linear strings of one-dimensional (1D) sequences. Here, we leverage pLMs to simultaneously model both modalities by combining 1D sequences with 3D structure in a single model. We encode protein structures as token sequences using the 3Di-alphabet introduced by the 3D-alignment method Foldseek. This new foundation pLM extracts the features and patterns of the resulting “structure-sequence” representation. Toward this end, we built a non-redundant dataset from AlphaFoldDB and fine-tuned an existing pLM (ProtT5) to translate between 3Di and amino acid sequences. As a proof-of-concept for our novel approach, dubbed Protein structure-sequence T5 (ProstT5), we showed improved performance for subsequent prediction tasks and for “inverse folding”, namely the generation of novel protein sequences adopting a given structural scaffold (“fold”). Our work showcased the potential of pLMs to tap into the information-rich protein structure revolution fueled by AlphaFold2. ProstT5 paves the way to develop new tools integrating the vast resource of 3D predictions, and opens new research avenues in the post-AlphaFold2 era. Our model is freely available for all at https://github.com/mheinzinger/ProstT5

Language: English

Cited by

66

Protein generation with evolutionary diffusion: sequence is all you need (Creative Commons)
Sarah Alamdari, Nitya Thakkar, Rianne van den Berg

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Volume: unknown

Published: Sep. 12, 2023

Deep generative models are increasingly powerful tools for the in silico design of novel proteins. Recently, a family of generative models called diffusion models has demonstrated the ability to generate biologically plausible proteins that are dissimilar to any actual proteins seen in nature, enabling unprecedented capability and control in de novo protein design. However, current state-of-the-art models generate protein structures, which limits the scope of their training data and restricts generations to a small and biased subset of protein design space. Here, we introduce a general-purpose diffusion framework, EvoDiff, that combines evolutionary-scale data with the distinct conditioning capabilities of diffusion models for controllable protein generation in sequence space. EvoDiff generates high-fidelity, diverse, and structurally-plausible proteins that cover natural sequence and functional space. We show experimentally that EvoDiff generations express, fold, and exhibit expected secondary structure elements. Critically, EvoDiff can generate proteins inaccessible to structure-based models, such as those with disordered regions, while maintaining the ability to design scaffolds for functional structural motifs. We validate the universality of our sequence-based formulation by experimentally characterizing intrinsically-disordered mitochondrial targeting signals, metal-binding proteins, and protein binders designed using EvoDiff. We envision that EvoDiff will expand capabilities in protein engineering beyond the structure-function paradigm toward programmable, sequence-first design.

Language: English

Cited by

66

Bilingual language model for protein sequence and structure (Creative Commons)
Michael Heinzinger, Konstantin Weißenow, Joaquin Gomez Sanchez

et al.

NAR Genomics and Bioinformatics, Year: 2024, Volume: 6(4)

Published: Sep. 28, 2024

Adapting language models to protein sequences spawned the development of powerful protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein structure prediction. Now we can systematically and comprehensively explore the dual nature of proteins that act and exist as three-dimensional (3D) machines and evolve as linear strings of one-dimensional (1D) sequences. Here, we leverage pLMs to simultaneously model both modalities in a single model. We encode protein structures as token sequences using the 3Di-alphabet introduced by the 3D-alignment method

Language: English

Cited by

31

Computational scoring and experimental evaluation of enzymes generated by neural networks (Creative Commons)
Sean R. Johnson, Xiaozhi Fu, Sandra Viknander

et al.

Nature Biotechnology, Year: 2024, Volume: unknown

Published: April 23, 2024

In recent years, generative protein sequence models have been developed to sample novel sequences. However, predicting whether generated proteins will fold and function remains challenging. We evaluate a set of 20 diverse computational metrics to assess the quality of enzyme sequences produced by three contrasting generative models: ancestral sequence reconstruction, a generative adversarial network and a protein language model. Focusing on two enzyme families, we expressed and purified over 500 natural and generated sequences with 70-90% identity to the most similar natural sequences to benchmark computational metrics for predicting in vitro enzyme activity. Over three rounds of experiments, we developed a computational filter that improved the rate of experimental success by 50-150%. The proposed metrics and models will drive protein engineering research by serving as a benchmark for generative protein sequence models and helping to select active variants for experimental testing.

Language: English

Cited by

27

Advances in microbial exoenzymes bioengineering for improvement of bioplastics degradation (Creative Commons)
Farzad Rahmati, Debadatta Sethi, Weixi Shu

et al.

Chemosphere, Year: 2024, Volume: 355, Pages: 141749

Published: March 21, 2024

Plastic pollution has become a major global concern, posing numerous challenges for the environment and wildlife. Most conventional ways of plastics degradation are inefficient and cause great damage to ecosystems. The development of biodegradable plastics offers a promising solution for waste management. These plastics are designed to break down under various conditions, opening up new possibilities to mitigate the negative impact of traditional plastics. Microbes, including bacteria and fungi, play a crucial role in the degradation of bioplastics by producing and secreting extracellular enzymes, such as cutinase, lipases, and proteases. However, these microbial enzymes are sensitive to extreme environmental conditions, such as temperature and acidity, affecting their functions and stability. To address these challenges, scientists have employed protein engineering and immobilization techniques to enhance enzyme stability and predict protein structures. Strategies such as improving enzyme and substrate interaction, increasing enzyme thermostability, reinforcing the bonding between the active site of the enzyme and the substrate, and refining enzyme activity are being utilized to boost enzyme functionality. Recently, bioengineering through gene cloning and expression in potential microorganisms has revolutionized the biodegradation of bioplastics. This review aimed to discuss the most recent protein engineering strategies for modifying bioplastic-degrading enzymes in terms of functionality, thermostability enhancement, and the substrate binding site, along with other improvements such as surface action. Additionally, bioplastic-degrading exoenzymes discovered by metagenomics were emphasized.

Language: English

Cited by

26

Atomic context-conditioned protein sequence design using LigandMPNN (Creative Commons)
Justas Dauparas, Gyu Rie Lee, Robert Pecoraro

et al.

Nature Methods, Year: 2025, Volume: unknown

Published: March 28, 2025

Protein sequence design in the context of small molecules, nucleotides and metals is critical to enzyme and small-molecule binder and sensor design, but current state-of-the-art deep-learning-based sequence design methods are unable to model nonprotein atoms and molecules. Here we describe a deep-learning-based protein sequence design method called LigandMPNN that explicitly models all nonprotein components of biomolecular systems. LigandMPNN significantly outperforms Rosetta and ProteinMPNN on native backbone sequence recovery for residues interacting with small molecules (63.3% versus 50.4% and 50.5%), nucleotides (50.5% versus 35.2% and 34.0%) and metals (77.5% versus 36.0% and 40.6%). LigandMPNN generates not only sequences but also sidechain conformations to allow detailed evaluation of binding interactions. LigandMPNN has been used to design over 100 experimentally validated small-molecule and DNA-binding proteins with high affinity and high structural accuracy (as indicated by four X-ray crystal structures), and redesign of Rosetta small-molecule binder designs has increased binding affinity by as much as 100-fold. We anticipate that LigandMPNN will be widely useful for designing new binding proteins, sensors and enzymes.

Language: English

Cited by

3

Accelerating Biocatalysis Discovery with Machine Learning: A Paradigm Shift in Enzyme Engineering, Discovery, and Design (Creative Commons)

Markus Braun, Christian C. Gruber, Andreas Krassnigg

et al.

ACS Catalysis, Year: 2023, Volume: 13(21), Pages: 14454-14469

Published: Oct. 26, 2023

Emerging computational tools promise to revolutionize protein engineering for biocatalytic applications and accelerate the development timelines previously needed to optimize an enzyme to its more efficient variant. For over a decade, the benefits of predictive algorithms have helped scientists and engineers navigate the complexity of functional protein sequence space. More recently, spurred by dramatic advances in underlying computational tools, the promise of faster, cheaper, and more accurate enzyme identification, characterization, and engineering has catapulted terms such as artificial intelligence and machine learning to the must-have vocabulary in the field. This Perspective aims to showcase the current status of applications in the pharmaceutical industry and also to discuss and celebrate the innovative approaches in protein science by highlighting their potential in selected recent developments and offering thoughts on future opportunities for biocatalysis. It also critically assesses the technology's limitations, unanswered questions, and unmet challenges.

Language: English

Cited by

39