Bacterial Metallostasis: Metal Sensing, Metalloproteome Remodeling, and Metal Trafficking
Daiana A. Capdevila, Johnma J. Rondón, Katherine A. Edmonds

et al.

Chemical Reviews, Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 10, 2024

Transition metals function as structural and catalytic cofactors for a large diversity of proteins and enzymes that collectively comprise the metalloproteome. Metallostasis considers all cellular processes, notably metal sensing, metalloproteome remodeling, and trafficking (or allocation) of metals, that collectively ensure the functional integrity and adaptability of the metalloproteome. Bacteria employ both protein and RNA-based mechanisms that sense intracellular transition metal bioavailability and orchestrate systems-level outputs that maintain metallostasis. In this review, we contextualize metallostasis by briefly discussing the metalloproteome and the specialized roles that metals play in biology. We then offer a comprehensive perspective on the diversity of metalloregulatory proteins and metal-sensing riboswitches, defining general principles within each sensor superfamily that capture how specificity is encoded in the sequence, and how selectivity can be leveraged in downstream synthetic biology and biotechnology applications. This is followed by a discussion of recent work that highlights selected metalloregulatory outputs, including metalloproteome remodeling and metal allocation by metallochaperones to both client proteins and compartments. We close by briefly discussing places where more work is needed to fill in gaps in our understanding of metallostasis.

Language: English

Rapid in silico directed evolution by a protein language model with EVOLVEpro
Kaiyi Jiang, Zhaoqing Yan, Matteo Di Bernardo

et al.

Science, Journal Year: 2024, Volume and Issue: unknown

Published: Nov. 21, 2024

Directed protein evolution is central to biomedical applications but faces challenges like experimental complexity, inefficient multi-property optimization, and local maxima traps. While in silico methods using protein language models (PLMs) can provide modeled fitness landscape guidance, they struggle to generalize across diverse protein families and to map to protein activity. We present EVOLVEpro, a few-shot active learning framework that combines PLMs and regression models to rapidly improve protein activity. EVOLVEpro surpasses current methods, yielding up to 100-fold improvements in desired properties. We demonstrate its effectiveness across six proteins in RNA production, genome editing, and antibody binding applications. These results highlight the advantages of few-shot active learning with minimal experimental data over zero-shot predictions. EVOLVEpro opens new possibilities for AI-guided protein engineering in biology and medicine.

Language: English

Citations

20
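
The EVOLVEpro abstract above describes a few-shot active learning loop that pairs a protein language model with a regression model. The following is a minimal sketch of that general idea only, not the authors' implementation. Everything in it is an assumption made for illustration: the one-hot `embed` function stands in for a PLM embedding, `true_fitness` is a toy assay, and a k-nearest-neighbour average is swapped in for whatever regressor the paper actually uses.

```python
import random

ALPHABET = "AKLMV"  # toy residue alphabet (assumption)


def embed(seq):
    """Hypothetical stand-in for a PLM embedding: a flat one-hot encoding."""
    return [1.0 if aa == c else 0.0 for c in seq for aa in ALPHABET]


def true_fitness(seq, target="MKVLA"):
    """Toy 'assay' (assumption): fraction of positions matching a hidden target."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)


def knn_predict(x, train_x, train_y, k=3):
    """Predict fitness as the mean label of the k nearest measured variants."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, tx)), y)
        for tx, y in zip(train_x, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


def make_pool(n=200, length=5, seed=1):
    """Build a deterministic pool of distinct random candidate sequences."""
    rng = random.Random(seed)
    pool = set()
    while len(pool) < n:
        pool.add("".join(rng.choice(ALPHABET) for _ in range(length)))
    return sorted(pool)


def active_learning(pool, n_init=8, batch=8, rounds=4, seed=0):
    """Measure a small random batch, then repeatedly fit, rank, and measure."""
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    measured = {s: true_fitness(s) for s in pool[:n_init]}
    history = [max(measured.values())]  # best measured fitness per round
    for _ in range(rounds):
        train_x = [embed(s) for s in measured]
        train_y = list(measured.values())
        candidates = [s for s in pool if s not in measured]
        # Rank unmeasured variants by predicted fitness, best first.
        ranked = sorted(
            candidates,
            key=lambda s: knn_predict(embed(s), train_x, train_y),
            reverse=True,
        )
        for s in ranked[:batch]:
            measured[s] = true_fitness(s)  # "run the assay" on the top picks
        history.append(max(measured.values()))
    return measured, history


if __name__ == "__main__":
    measured, history = active_learning(make_pool())
    print("best measured fitness per round:", history)
```

The point of the loop is that only `n_init + rounds * batch` variants are ever "assayed"; the regressor decides which ones. In a realistic setting the embedding and regressor would be replaced by real models, and the ranking step might add an exploration term.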

Machine Learning-Based Process Optimization in Biopolymer Manufacturing: A Review
Ivan Malashin, D. A. Martysyuk, V. S. Tynchenko

et al.

Polymers, Journal Year: 2024, Volume and Issue: 16(23), P. 3368 - 3368

Published: Nov. 29, 2024

The integration of machine learning (ML) into material manufacturing has driven advancements in optimizing biopolymer production processes. ML techniques, applied across various stages of biopolymer production, enable the analysis of complex data generated throughout production, identifying patterns and insights not easily observed through traditional methods. As sustainable alternatives to petrochemical-based plastics, biopolymers present unique challenges due to their reliance on variable bio-based feedstocks and complex processing conditions. This review systematically summarizes the current applications of ML techniques in biopolymer production, aiming to provide a comprehensive reference for future research while highlighting the potential of ML to enhance efficiency, reduce costs, and improve product quality. This review also shows the role of ML algorithms, including supervised, unsupervised, and deep learning algorithms, in optimizing biopolymer manufacturing processes.

Language: English

Citations

4

Self-supervised machine learning methods for protein design improve sampling but not the identification of high-fitness variants
Moritz Ertelt, Rocco Moretti, Jens Meiler

et al.

Science Advances, Journal Year: 2025, Volume and Issue: 11(7)

Published: Feb. 12, 2025

Machine learning (ML) is changing the world of computational protein design, with data-driven methods surpassing biophysical-based methods in experimental success. However, they are most often reported as case studies, lack integration and standardization, and are therefore hard to objectively compare. In this study, we established a streamlined and diverse toolbox for methods that predict amino acid probabilities inside the Rosetta software framework that allows for the side-by-side comparison of these models. Subsequently, existing protein fitness landscapes were used to benchmark novel ML methods in realistic protein design settings. We focused on the traditional problems of protein design: sampling and scoring. A major finding of our study is that ML approaches are better at purging the sampling space from deleterious mutations. Nevertheless, scoring resulting mutations without model fine-tuning showed no clear improvement over scoring with Rosetta. We conclude that ML now complements, rather than replaces, biophysical methods in protein design.

Language: English

Citations

0

Understanding, inhibiting, and engineering membrane transporters with high-throughput mutational screens
Steven T. Miller, Christian B. Macdonald, Srivatsan Raman

et al.

Cell chemical biology, Journal Year: 2025, Volume and Issue: unknown

Published: March 1, 2025

Language: English

Citations

0

SaprotHub: Making Protein Modeling Accessible to All Biologists
Jin Su, Zhikai Li, Chenchen Han

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: May 28, 2024

Abstract: Training and deploying deep learning models pose challenges for users without machine learning (ML) expertise. SaprotHub offers a user-friendly platform that democratizes the process of training, utilizing, storing, and sharing protein ML models, fostering collaboration within the biology community, all without requiring extensive ML expertise. At its core, Saprot is an advanced, foundational protein language model. Through its ColabSaprot framework, it supports potentially hundreds of protein training and prediction applications, enabling the co-construction and co-sharing of these trained models. This enhances user engagement and drives community-wide innovation.

Language: English

Citations

3

SeqDance: A Protein Language Model for Representing Protein Dynamic Properties
Chao Hou, Yufeng Shen

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 15, 2024

Abstract: Proteins perform their functions by folding amino acid sequences into dynamic structural ensembles. Despite the important role of protein dynamics, their complexity and the absence of efficient representation methods have limited their integration into studies on protein function and mutation fitness, especially in deep learning applications. To address this, we present SeqDance, a protein language model designed to learn protein dynamic properties directly from sequence alone. SeqDance is pre-trained on dynamic biophysical properties derived from over 30,400 molecular dynamics trajectories and 28,600 normal mode analyses. Our results show that SeqDance effectively captures local dynamic interactions, co-movement patterns, and global conformational features, even for proteins lacking homologs in the pre-training set. Additionally, we showed that SeqDance enhances the prediction of protein fitness landscapes, disorder-to-order transition binding regions, and phase-separating proteins. By learning dynamic properties from sequence, SeqDance complements conventional evolution- and static structure-based methods, offering new insights into protein behavior and function.

Language: English

Citations

1

Evaluation of Machine Learning-Assisted Directed Evolution Across Diverse Combinatorial Landscapes
Francesca-Zhoufan Li, Jason Yang, Kadina E. Johnston

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 24, 2024

Summary: Various machine learning-assisted directed evolution (MLDE) strategies have been shown to identify high-fitness protein variants more efficiently than typical wet-lab directed evolution approaches. However, limited understanding of the factors influencing MLDE performance across diverse proteins has hindered optimal strategy selection for wet-lab campaigns. To address this, we systematically analyzed multiple MLDE strategies, including active learning and focused training using six distinct zero-shot predictors, across 16 diverse protein fitness landscapes. By quantifying landscape navigability with six attributes, we found that MLDE offers a greater advantage on landscapes which are more challenging for directed evolution, especially when focused training is combined with active learning. Despite varying levels of advantage across landscapes, focused training with zero-shot predictors leveraging distinct evolutionary, structural, and stability knowledge sources consistently outperforms random sampling for both binding interactions and enzyme activities. Our findings provide practical guidelines for selecting MLDE strategies for protein engineering.

Language: English

Citations

1

Deciphering GB1's Single Mutational Landscape: Insights from MuMi Analysis
Tandac F. Guclu, Ali Rana Atılgan, Canan Atılgan

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: June 3, 2024

Abstract: Mutational changes that affect the binding of the C2 fragment of Streptococcal protein G (GB1) to the Fc domain of human IgG (IgG-Fc) have been extensively studied using deep mutational scanning (DMS), and the binding affinity of all single mutations has been measured experimentally in the literature. To investigate the underlying molecular basis, we perform in-silico mutational scanning for all possible single mutations, along with 2-µs-long molecular dynamics (WT-MD) of wild-type (WT) GB1 in both unbound and IgG-Fc bound forms. We compute the hydrogen bonds between GB1 and IgG-Fc in WT-MD to identify the dominant hydrogen bonds for binding, which we then assess in conformations produced by Mutation and Minimization (MuMi) to explain the fitness landscape of GB1 and IgG-Fc binding. Furthermore, we analyze MuMi conformations by focusing on the relative solvent accessibility (RSA) of residues and the probability of residues being located at the binding interface. With these analyses, we explain the interactions between GB1 and IgG-Fc and display the structural features of binding. Our findings pave the way for improved predictive accuracy in protein stability and interaction studies, which are crucial for advancements in drug design and synthetic biology.

Language: English

Citations

0

Deciphering GB1’s Single Mutational Landscape: Insights from MuMi Analysis
Tandac F. Guclu, Ali Rana Atılgan, Canan Atılgan

et al.

The Journal of Physical Chemistry B, Journal Year: 2024, Volume and Issue: 128(33), P. 7987 - 7996

Published: Aug. 8, 2024

Mutational changes that affect the binding of the C2 fragment of Streptococcal protein G (GB1) to the Fc domain of human IgG (IgG-Fc) have been extensively studied using deep mutational scanning (DMS), and the binding affinity of all single mutations has been measured experimentally in the literature. To investigate the underlying molecular basis, we perform in silico mutational scanning for all possible single mutations, along with 2 μs-long molecular dynamics (WT-MD) of wild-type (WT) GB1 in both unbound and IgG-Fc bound forms. We compute the hydrogen bonds between GB1 and IgG-Fc in WT-MD to identify the dominant hydrogen bonds for binding, which we then assess in conformations produced by Mutation and Minimization (MuMi) to explain the fitness landscape of GB1 and IgG-Fc binding. Furthermore, we analyze MuMi conformations by focusing on the relative solvent accessibility of residues and the probability of residues being located at the binding interface. With these analyses, we explain the interactions between GB1 and IgG-Fc and display the structural features of binding. In sum, our findings highlight the potential of MuMi as a reliable and computationally efficient tool for predicting protein fitness landscapes, offering significant advantages over traditional methods. The methodologies and results presented in this study pave the way for improved predictive accuracy in protein stability and interaction studies, which are crucial for advancements in drug design and synthetic biology.

Language: English

Citations

0

Designing diverse and high-performance proteins with a large language model in the loop
Carlos Alberto Gomez-Uribe, Japheth E. Gado, Meiirbek Islamov

et al.

Published: Oct. 29, 2024

Abstract: We present a novel protein engineering approach to directed evolution with machine learning that integrates a new semi-supervised neural network fitness prediction model, Seq2Fitness, and an innovative optimization algorithm, biphasic annealing for diverse adaptive sequence sampling (BADASS), to design sequences. Seq2Fitness leverages protein language models to predict fitness landscapes, combining evolutionary data with experimental labels, while BADASS efficiently explores these landscapes by dynamically adjusting temperature and mutation energies to prevent premature convergence and find diverse high-fitness sequences. Seq2Fitness predictions improve the Spearman correlation with fitness measurements over alternative model predictions, e.g., from 0.34 to 0.55 for sequences with mutations at residues that are absent from the training set. BADASS requires less memory and computation compared to gradient-based Markov Chain Monte Carlo methods, while finding more higher-fitness sequences and maintaining sequence diversity in protein design tasks for two different protein families with hundreds of amino acids. For example, for both protein families 100% of the top 10,000 sequences found have higher fitness than the wildtype sequence, versus a broad range between 3% and 99% for competing approaches, with often many fewer than 10,000 sequences found. The top, 100th, and 1,000th sequences found all also have higher fitness. In addition, we developed a theoretical framework to explain where BADASS comes from, why it works, and how it behaves. Although we only evaluate BADASS here on amino acid sequences, it may be more broadly useful for exploration of other sequence spaces, including DNA and RNA. To ensure reproducibility and facilitate adoption, our code is publicly available.

Author summary: Designing proteins with enhanced properties is essential for many applications, from industrial enzymes to therapeutic molecules. However, traditional protein engineering methods often fail to explore the vast sequence space effectively, partly due to the rarity of high-fitness sequences. In this work, we introduce BADASS, an optimization algorithm that samples sequences from a probability distribution with mutation energies and a temperature parameter that are updated dynamically, alternating between cooling and heating phases, to discover high-fitness sequences while maintaining sequence diversity. This stands in contrast to traditional approaches like simulated annealing, which often converge on lower fitness solutions, and Markov Chain Monte Carlo (MCMC), converging on solutions with lower fitness at a significantly higher computational cost. Our approach requires only forward model evaluations and no gradient computations, enabling the rapid design of high-performing proteins that can be validated in the lab, especially when combined with fitness prediction models. BADASS represents a significant advance in protein engineering, opening new possibilities for diverse applications.

Language: English

Citations

0
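
The BADASS abstract above describes sampling sequences from a temperature-controlled distribution while alternating between cooling and heating phases. The sketch below illustrates only that broad idea and is not the published BADASS algorithm. The scoring function, the alphabet, the single-site proposal scheme, and every parameter (`t_min`, `t_max`, `cool`, `heat`, `patience`) are assumptions chosen for illustration.

```python
import math
import random

ALPHABET = "ACDE"  # toy residue alphabet (assumption)


def score(seq, target="ACDEACDE"):
    """Toy fitness model (assumption): number of positions matching a target."""
    return sum(a == b for a, b in zip(seq, target))


def boltzmann_choice(options, scores, temperature, rng):
    """Sample one option with probability proportional to exp(score / T)."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - top) / temperature) for s in scores]
    return rng.choices(options, weights=weights, k=1)[0]


def biphasic_search(start, steps=200, t_min=0.2, t_max=2.0,
                    cool=0.9, heat=1.5, patience=10, seed=0):
    """Cool while the search keeps moving; reheat after `patience` stale steps."""
    rng = random.Random(seed)
    current, best = start, start
    temperature, stale = t_max, 0
    seen = {start}  # every distinct sequence visited, as a crude diversity record
    for _ in range(steps):
        pos = rng.randrange(len(current))
        # Propose every residue at one random position (includes "no change").
        options = [current[:pos] + aa + current[pos + 1:] for aa in ALPHABET]
        current = boltzmann_choice(
            options, [score(o) for o in options], temperature, rng
        )
        seen.add(current)
        if score(current) > score(best):
            best, stale = current, 0
        else:
            stale += 1
        if stale >= patience:
            temperature = min(t_max, temperature * heat)  # heating phase
            stale = 0
        else:
            temperature = max(t_min, temperature * cool)  # cooling phase
    return best, seen


if __name__ == "__main__":
    best, seen = biphasic_search("AAAAAAAA")
    print("best sequence:", best, "score:", score(best))
    print("distinct sequences visited:", len(seen))
```

Like the approach the abstract describes, this loop needs only forward evaluations of the scoring function and no gradients. Reheating when progress stalls is one simple way to trade convergence against diversity; the actual update rules in the paper may differ.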