How Does Sampling Affect the AI Prediction Accuracy of Peptides' Physicochemical Properties?
Min Yan, Ankeer Abuduhebaier, Hao Zhou, et al.
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown
Published: Feb. 2, 2025
Abstract
Accurate AI prediction of peptide physicochemical properties is essential for advancing peptide-based biomedicine, biotechnology, and bioengineering. However, the performance of predictive models is significantly affected by the representativeness of the training data, which depends on the sample size and the sampling methods employed. This study addresses the challenge of determining the optimal sample size and sampling methods to enhance the predictive accuracy and generalization capacity of AI models for estimating the aggregation propensity, hydrophilicity, and isoelectric point of tetrapeptides. Four sampling methods were evaluated: Latin Hypercube Sampling (LHS), Uniform Design Sampling (UDS), Simple Random Sampling (SRS), and Probability-Proportional-to-Size Sampling (PPS), across sample sizes ranging from 100 to 20,000. A sample size of approximately 12,000 (7.5% of the total tetrapeptide dataset) marks a key threshold for stable and consistent model performance. This study provides valuable insights into the interplay between sample size, sampling strategies, and model performance, offering a foundational framework for optimizing data collection for predicting peptides' physicochemical properties, especially in the complete sequence space of longer peptides with more than four amino acids.
Language: English
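The sampling figures in the abstract above can be checked with a small sketch. It assumes the "total tetrapeptide dataset" is the full space of 20^4 = 160,000 sequences over the 20 standard amino acids, which is consistent with 12,000 being 7.5% of the total. The function names (`tetrapeptide_space`, `simple_random_sample`, `latin_hypercube_sample`) are hypothetical, and the code only illustrates how SRS and a basic LHS could be drawn over sequence positions. It is not the authors' implementation, and UDS and PPS are not shown.

```python
import itertools
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tetrapeptide_space():
    """Enumerate every tetrapeptide (assumed: 20**4 sequences)."""
    return ["".join(p) for p in itertools.product(AMINO_ACIDS, repeat=4)]


def simple_random_sample(space, n, seed=0):
    """Simple Random Sampling: n distinct sequences, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(space, n)


def latin_hypercube_sample(n, length=4, seed=0):
    """Basic Latin Hypercube Sampling over the sequence positions.

    Each position is treated as one dimension of the unit hypercube.
    The interval [0, 1) is cut into n strata, every stratum is used
    exactly once per dimension, and the drawn value is mapped onto one
    of the 20 amino acids. Duplicate sequences can occur.
    """
    rng = random.Random(seed)
    levels = len(AMINO_ACIDS)
    columns = []
    for _ in range(length):
        strata = list(range(n))
        rng.shuffle(strata)  # independent permutation per position
        column = []
        for s in strata:
            u = (s + rng.random()) / n  # one point inside stratum s
            column.append(AMINO_ACIDS[min(int(u * levels), levels - 1)])
        columns.append(column)
    return ["".join(residues) for residues in zip(*columns)]


space = tetrapeptide_space()
print(len(space))                    # 160000
print(f"{12000 / len(space):.1%}")   # 7.5%
```

When `n` is a multiple of 20, this LHS variant places every amino acid at each position equally often, which is the kind of balanced coverage that distinguishes it from SRS at small sample sizes.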
Interplay of Hydrophobicity, Charge, and Sequence Length in Oligopeptide Coassembly
Subhadra Thapa, Anshul Gahlawat, Severin T. Schneebeli, et al.
The Journal of Physical Chemistry B, Journal Year: 2025, Volume and Issue: unknown
Published: April 23, 2025
Peptide coassembly offers novel opportunities for designing advanced nanomaterials. This study used coarse-grained molecular dynamics simulations to examine the coassembly of charge-complementary peptides, assessing various ratios and the role of charge and hydrophobicity in their aggregation. We discovered that peptide length, charge, and hydrophobicity significantly influence coassembly behavior, with more hydrophobic peptides exhibiting greater aggregation despite electrostatic repulsion. Beyond two peptides, we also observed that the coassembly of more than two peptides will likely lead to new assembly structures and properties. Our findings underscore the importance of peptide composition and length in tuning the resulting coassembly properties, thus facilitating the design of complex nanoparticles for biomedical and biotechnological applications.
Language: English
Integrating sequence and chemical insights: a co-modeling AI prediction framework for peptides
Journal of Materials Informatics, Journal Year: 2025, Volume and Issue: 5(2)
Published: Feb. 27, 2025
Understanding the impact of the primary structure of peptides on a range of physicochemical properties is crucial for the development of various applications. Peptides can be conceptualized as sequences of amino acids in their biological representation and as molecular architectures composed of atoms and chemical bonds in their chemical representation. This study examines the influence of different representations on the local interpretability and accuracy of the respective prediction models and has developed "feature attribution" methodologies based on these representations. The effectiveness of these methodologies is validated through analyses, specifically within the context of peptide aggregation propensity (AP) prediction, with training datasets derived from high-throughput molecular dynamics (MD) simulations. Our findings reveal significant discrepancies in the feature attribution extracted from sequence-based and structure-based representations, which led to the proposal of a co-modeling framework that integrates insights from both perspectives. Empirical comparisons have demonstrated that the contrastive learning-based co-modeling framework excels in terms of accuracy and efficiency. This research not only extends the applicability of the feature attribution method but also lays the groundwork for elucidating the intrinsic mechanisms governing peptide activities and functions with the aid of domain-specific knowledge. Moreover, the co-modeling strategy is poised to enhance the precision of downstream applications and facilitate future endeavors in drug discovery and protein engineering.
Language: English