Integrating sequence and chemical insights: a co-modeling AI prediction framework for peptides DOI Open Access
Zihan Liu, Min Yan, Zhihui Zhu

et al.

Journal of Materials Informatics, Journal Year: 2025, Volume and Issue: 5(2)

Published: Feb. 27, 2025

Understanding the impact of the primary structure of peptides on a range of physicochemical properties is crucial for the development of various applications. Peptides can be conceptualized as sequences of amino acids in their biological representation and as molecular architectures composed of atoms and chemical bonds in their chemical representation. This study examines the influence of these different representations on the local interpretability and accuracy of the respective prediction models and has developed “feature attribution” methodologies based on these representations. The effectiveness of these methodologies is validated through analyses, specifically within the context of peptide aggregation propensity (AP) prediction, with training datasets derived from high-throughput molecular dynamics (MD) simulations. Our findings reveal significant discrepancies in the feature attribution extracted from sequence-based and structure-based representations, which led to the proposal of a co-modeling framework that integrates insights from both perspectives. Empirical comparisons have demonstrated that the contrastive learning-based co-modeling framework excels in terms of efficiency. This research not only extends the applicability of the feature attribution method but also lays the groundwork for elucidating the intrinsic mechanisms governing peptide activities and functions with the aid of domain-specific knowledge. Moreover, the co-modeling strategy is poised to enhance the precision of downstream applications and facilitate future endeavors in drug discovery and protein engineering.

Language: English

How Does Sampling Affect the AI Prediction Accuracy of Peptides' Physicochemical Properties? DOI Open Access
Min Yan, Ankeer Abuduhebaier, Hao Zhou

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 2, 2025

Accurate AI prediction of peptide physicochemical properties is essential for advancing peptide-based biomedicine, biotechnology, and bioengineering. However, the performance of predictive models is significantly affected by the representativeness of the training data, which depends on the sample size and the sampling methods employed. This study addresses the challenge of determining the optimal sample size and sampling methods to enhance the accuracy and generalization capacity of models for estimating the aggregation propensity, hydrophilicity, and isoelectric point of tetrapeptides. Four sampling methods were evaluated: Latin Hypercube Sampling (LHS), Uniform Design Sampling (UDS), Simple Random Sampling (SRS), and Probability-Proportional-to-Size Sampling (PPS), across sample sizes ranging from 100 to 20,000. A sample size of approximately 12,000 (7.5% of the total tetrapeptide dataset) marks a key threshold for stable and consistent model performance. This study provides valuable insights into the interplay between sample size, sampling strategies, and model performance, offering a foundational framework for optimizing data collection for the AI prediction of peptides’ physicochemical properties, especially in the complete sequence space of longer peptides with more than four amino acids.

Language: English

Citations

0
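
As a rough illustration of the sample-size figures quoted in the sampling study's abstract above, the following minimal sketch enumerates the tetrapeptide sequence space and draws a simple random sample from it. It assumes the space is built from the 20 standard amino acids (consistent with 12,000 being 7.5% of 160,000) and that SRS is done without replacement; the function names and the seed are hypothetical and are not taken from the paper.

```python
import itertools
import random

# Assumption: the 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def all_tetrapeptides():
    """Enumerate every tetrapeptide sequence (20**4 = 160,000 in total)."""
    return ["".join(p) for p in itertools.product(AMINO_ACIDS, repeat=4)]


def simple_random_sample(population, n, seed=0):
    """Draw n distinct sequences uniformly at random, without replacement.

    This is an illustrative stand-in for Simple Random Sampling (SRS);
    the paper's actual procedure may differ.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


space = all_tetrapeptides()
sample = simple_random_sample(space, 12_000)
fraction = len(sample) / len(space)

print(len(space), len(sample), f"{fraction:.1%}")
# prints: 160000 12000 7.5%
```

The other three methods named in the abstract (LHS, UDS, PPS) would replace only the `simple_random_sample` step; they are not sketched here because the abstract does not describe how they were applied to sequences.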

Interplay of Hydrophobicity, Charge, and Sequence Length in Oligopeptide Coassembly DOI

Subhadra Thapa, Anshul Gahlawat, Severin T. Schneebeli

et al.

The Journal of Physical Chemistry B, Journal Year: 2025, Volume and Issue: unknown

Published: April 23, 2025

Peptide coassembly offers novel opportunities for designing advanced nanomaterials. This study used coarse-grained molecular dynamics simulations to examine the coassembly of charge-complementary peptides, assessing various ratios and the role of charge and hydrophobicity in their aggregation. We discovered that peptide length, charge, and hydrophobicity significantly influence coassembly behavior, with more hydrophobic peptides exhibiting greater aggregation despite electrostatic repulsion. Beyond two peptides, we also observed that the coassembly of more than two peptides will likely lead to new assembly structures and properties. Our findings underscore the importance of peptide composition and length in tuning the resulting properties, thus facilitating the design of complex nanoparticles for biomedical and biotechnological applications.

Language: English

Citations

0
