Accelerated Missense Mutation Identification in Intrinsically Disordered Proteins using Deep Learning
Swarnadeep Seth, Aniket Bhattacharya

Published: July 10, 2024

We use a combination of Brownian dynamics (BD) simulation results and deep learning (DL) strategies for the rapid identification of large structural changes caused by missense mutations in intrinsically disordered proteins (IDPs). 2000 IDP sequences from the DisProt database of length 20–300 are used to obtain gyration radii from BD simulation on a coarse-grained single-bead amino acid model (HPS model) used by us and others [Seth et al. J. Chem. Phys. 160, 014902 (2024); Dignon et al. PLOS Comp. Biology, 14, 2018; Tesei et al. PNAS, 118, 2021] to generate the training sets for the DL algorithm. Using the gyration radii ⟨Rg⟩ of the simulated IDPs as the training set, we develop a multilayer perceptron neural net (NN) architecture that predicts the gyration radii of 33 IDPs previously studied using BD simulation with 95% accuracy from the sequence and the corresponding parameters from the HPS model. We now utilize this NN to predict the gyration radii of every permutation of missense mutations in IDPs. Our approach successfully identifies mutation-prone regions that induce significant alterations in the radius of gyration when compared to the wild-type IDP sequence. We further validate the prediction by running BD simulations on the subset of identified mutants. The neural network yields a (10⁴–10⁵)-fold faster computation in the search space for potentially harmful mutations. Our findings have substantial implications for understanding diseases related to missense mutations in IDPs and for the development of potential therapeutic interventions. The method can be extended to accurate predictions of other mutation effects in disordered proteins.

Language: English

Accelerated Missense Mutation Identification in Intrinsically Disordered Proteins Using Deep Learning
Swarnadeep Seth, Aniket Bhattacharya

Biomacromolecules, Journal Year: 2025, Volume and Issue: unknown

Published: March 12, 2025

We use a combination of Brownian dynamics (BD) simulation results and deep learning (DL) strategies for the rapid identification of large structural changes caused by missense mutations in intrinsically disordered proteins (IDPs). We used ∼6500 IDP sequences from the MobiDB database of length 20–300 to obtain gyration radii from BD simulation on a coarse-grained single-bead amino acid model (HPS2 model) used by us and others [Dignon, G. L. et al. PLoS Comput. Biol. 2018, 14, e1005941; Tesei et al. Proc. Natl. Acad. Sci. U.S.A. 2021, 118, e2111696118; Seth, S. et al. J. Chem. Phys. 2024, 160, 014902] to generate the training sets for the DL algorithm. Using the gyration radii ⟨Rg⟩ of the simulated IDPs as the training set, we develop a multilayer perceptron neural net (NN) architecture that predicts the gyration radii of 33 IDPs previously studied using BD simulation with 97% accuracy from the sequence and the corresponding parameters from the HPS model. We now utilize this NN to predict the gyration radii of every permutation of missense mutations in IDPs. Our approach successfully identifies mutation-prone regions that induce significant alterations in the radius of gyration when compared to the wild-type IDP sequence. We further validate the prediction by running BD simulations on the subset of identified mutants. The neural network yields a (10⁴–10⁶)-fold faster computation in the search space for potentially harmful mutations. Our findings have substantial implications for understanding diseases related to missense mutations in IDPs and for the development of potential therapeutic interventions. The method can be extended to accurate predictions of other mutation effects in disordered proteins.

Language: English

Citations

0

SOP-MULTI: A Self-Organized Polymer-Based Coarse-Grained Model for Multidomain and Intrinsically Disordered Proteins with Conformation Ensemble Consistent with Experimental Scattering Data
Krishnakanth Baratam, Anand Srivastava

Journal of Chemical Theory and Computation, Journal Year: 2024, Volume and Issue: 20(22), P. 10179 - 10198

Published: Nov. 5, 2024

Multidomain proteins with long flexible linkers and full-length intrinsically disordered proteins (IDPs) are best defined as an ensemble of conformations rather than a single structure. Determining high-resolution structures of such proteins poses various challenges when using tools from experimental structural biophysics. Integrative approaches combining available low-resolution ensemble-averaged experimental data and in silico biomolecular reconstructions are now often used for the purpose. However, extensive Boltzmann-weighted conformation sampling for large proteins, especially for ones where both folded and disordered domains exist in the same polypeptide chain, remains a challenge. In this work, we present a 2-site per amino-acid resolution SOP-MULTI force field for simulating coarse-grained models of multidomain proteins. SOP-MULTI combines two well-established self-organized polymer models: (i) SOP-SC for folded systems and (ii) SOP-IDP for IDPs. For SOP-MULTI, we introduce cross-interaction terms between the beads belonging to the folded and disordered regions to generate conformation ensembles for hnRNP A1, TDP-43, G3BP1, hGHR-ECD, TIA1, HIV-1 Gag, polyubiquitin, and FUS. When back-mapped to all-atom resolution, SOP-MULTI trajectories faithfully recapitulate the scattering data over the range of the reciprocal space. We also show that individual folded domains preserve native contacts with respect to the solved folded structures, and root-mean-square fluctuations of residues in folded domains match those obtained from all-atom molecular dynamics simulation trajectories of the same folded systems. The SOP-MULTI force field is made available as a LAMMPS-compatible user package along with setup codes for generating the required files for any full-length protein with folded and disordered regions.

Language: English

Citations

3

Linear and Nonlinear Dielectric Response of Intrinsically Disordered Proteins
Michael A. Sauer, Taylor Colburn, Sthitadhi Maiti, et al.

The Journal of Physical Chemistry Letters, Journal Year: 2024, Volume and Issue: 15(20), P. 5420 - 5427

Published: May 14, 2024

Linear and nonlinear dielectric responses of solutions of intrinsically disordered proteins (IDPs) were analyzed by combining molecular dynamics simulations with formal theories. A large increment of the linear dielectric function over that of the solvent is found and related to the large dipole moments of IDPs. The nonlinear dielectric effect (NDE) of the IDP far exceeds that of the bulk electrolyte, offering a route to interrogate protein conformational and rotational statistics and dynamics. Conformational flexibility of the IDP makes the dipole moment statistics consistent with gamma/log-normal distributions and contributes to the NDE through the dipole moment's non-Gaussian parameter. The intrinsic non-Gaussian parameter of the dipole moment combines with the protein osmotic compressibility in the nonlinear dielectric susceptibility when dipolar correlations are screened by the electrolyte. The NDE is dominated by dipolar correlations when electrolyte screening is reduced.

Language: English

Citations

1

SOP-MULTI: A self-organized polymer based coarse-grained model for multi-domain and intrinsically disordered proteins with conformation ensemble consistent with experimental scattering data
Krishnakanth Baratam, Anand Srivastava

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: May 2, 2024

Multidomain proteins with long flexible linkers and full-length intrinsically disordered proteins (IDPs) are best defined as an ensemble of conformations rather than a single structure. Determining high-resolution structures of such proteins poses various challenges using tools from experimental structural biophysics. Integrative approaches combining available low-resolution ensemble-averaged experimental data and in silico biomolecular reconstructions are now often used for the purpose. However, exhaustive Boltzmann-weighted conformation sampling for large proteins, especially for ones where both folded and disordered domains exist in the same polypeptide chain, remains a challenge. In this work, we present a 2-site per amino-acid resolution SOP-MULTI force field for simulating coarse-grained models of multidomain proteins. SOP-MULTI combines two well-established self-organized polymer (SOP) models: (i) SOP-SC for folded systems and (ii) SOP-IDP for IDPs. For SOP-MULTI, we train cross-interaction terms between the beads belonging to the folded and disordered regions to generate experimentally-consistent conformation ensembles for multi-domain proteins such as hnRNPA1, TDP-43, G3BP1, hGHR-ECD, TIA1, HIV-1 Gag, Poly-Ubiquitin and FUS. When back-mapped to all-atom resolution, SOP-MULTI trajectories faithfully recapitulate the scattering data over the range of the reciprocal space. We also show that individual folded domains preserve native contacts with respect to the solved folded structures, and root mean square fluctuations of residues in folded domains match those obtained from all-atom molecular dynamics simulations of the same folded systems. The SOP-MULTI Force Field is made available as a LAMMPS-compatible user package along with setup codes for generating the required files for any full-length protein with folded and disordered regions.

Language: English

Citations

0
