bioRxiv (Cold Spring Harbor Laboratory),
Journal year: 2024,
Issue: unknown
Published: Oct. 15, 2024
ABSTRACT
AlphaFold2 (AF2), a deep-learning-based model that predicts protein structures from their amino acid sequences, has recently been used to predict multiple protein conformations. In some cases, AF2 has successfully predicted both dominant and alternative conformations of fold-switching proteins, which remodel their secondary and tertiary structures in response to cellular stimuli. Whether AF2 has learned enough protein folding principles to reliably predict alternative conformations outside of its training set is unclear. Here, we address this question by assessing whether CFold, an implementation of the AF2 network trained on a more limited subset of experimentally determined protein structures, predicts alternative conformations of eight fold switchers from six protein families. Previous work suggests that AF2 predicted these alternative conformations by memorizing them during training. Unlike AF2, CFold's training set contains only one of these alternative conformations. Despite sampling 1300-4400 structures/protein with various sequence sampling techniques, CFold predicted only one alternative structure outside of its training set accurately and with high confidence while also generating experimentally inconsistent structures with higher confidence. Though these results indicate that AF2's current success in predicting alternative conformations of fold switchers stems largely from its training data, results from a sequence pruning technique suggest developments that could lead to a more reliable generative model in the future.
Nature Communications,
Journal year: 2024,
Issue: 15(1)
Published: Aug. 24, 2024
Abstract
Recent work suggests that AlphaFold (AF), a deep learning-based model that can accurately infer protein structure from sequence, may discern important features of folded protein energy landscapes, defined by the diversity and frequency of different conformations in the folded state. Here, we test the limits of its predictive power on fold-switching proteins, which assume two structures with regions of distinct secondary and/or tertiary structure. We find that (1) AF is a weak predictor of fold switching and (2) some of its successes result from memorization of training-set structures rather than learned protein energetics. Combining >280,000 models from several implementations of AF2 and AF3, a 35% success rate was achieved for fold switchers likely in AF's training sets. AF2's confidence metrics selected against models consistent with experimentally determined fold-switching structures and failed to discriminate between low and high energy conformations. Further, AF captured only one out of seven experimentally confirmed fold switchers outside of its training sets despite extensive sampling of an additional ~280,000 models. Several observations indicate that AF2 has memorized structural information during training, and AF3 misassigns coevolutionary restraints. These limitations constrain the scope of successful AF predictions, highlighting the need for physically based methods that readily predict multiple protein conformations.
Current Opinion in Structural Biology,
Journal year: 2025,
Issue: 90, p. 102973
Published: Jan. 5, 2025
In recent years, advances in artificial intelligence (AI) have transformed structural biology, particularly protein structure prediction. Though AI-based methods, such as AlphaFold (AF), often predict single conformations of proteins with high accuracy and confidence, predictions of alternative folds are often inaccurate, low-confidence, or simply not predicted at all. Here, we review three blind spots that alternative folds reveal about AF-based protein structure prediction. First, proteins that assume conformations distinct from their training-set homologs can be mispredicted. Second, AF overrelies on its training set to predict alternative conformations. Third, degeneracies in pairwise representations can lead to high-confidence predictions inconsistent with experiment. These weaknesses suggest approaches to predict alternative folds more reliably.
Journal of Chemical Theory and Computation,
Journal year: 2024,
Issue: 20(12), pp. 5352-5367
Published: June 11, 2024
Markov state models (MSMs) have proven valuable in studying dynamics of protein conformational changes via statistical analysis of molecular dynamics (MD) simulations. In MSMs, the complex configuration space is coarse-grained into conformational states, with dynamics modeled by a series of Markovian transitions among these states at discrete lag times. Constructing a Markovian model at a specific lag time necessitates defining states that circumvent significant internal energy barriers, enabling relaxation within the lag time. This process effectively coarse-grains time and space, integrating out rapid motions within metastable states. Thus, MSMs possess a multi-resolution nature, where the granularity of states can be adjusted according to the time-resolution, offering flexibility in capturing system dynamics. This work introduces a continuous embedding approach for molecular conformations using the state predictive information bottleneck (SPIB), a framework that unifies dimensionality reduction and state space partitioning via a continuous, machine learned basis set. Without explicit optimization of VAMP-based scores, SPIB demonstrates state-of-the-art performance in identifying slow dynamical processes and constructing predictive multi-resolution Markovian models. Through applications to well-validated mini-proteins, SPIB showcases unique advantages compared to competing methods. It autonomously and self-consistently adjusts the number of metastable states based on a specified minimal time resolution, eliminating the need for manual tuning. While maintaining efficacy in dynamical properties, SPIB excels in accurately distinguishing metastable states and capturing numerous well-populated macrostates. This contrasts with existing VAMP-based methods, which often emphasize slow dynamics at the expense of incorporating numerous sparsely populated states. Furthermore, SPIB's ability to learn a low-dimensional continuous embedding of the underlying MSMs enhances the interpretation of dynamic pathways. With these benefits, we propose SPIB as an easy-to-implement methodology for end-to-end MSM construction.
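The abstract above describes MSMs as Markovian transitions among discrete states at a chosen lag time. As a rough illustration of that one idea only (not of SPIB itself, and not the authors' code), the sketch below estimates a row-normalized transition matrix from a discretized trajectory. The function name `estimate_transition_matrix`, the toy trajectory, and the sliding-window counting scheme are assumptions made for this example.

```python
import numpy as np


def estimate_transition_matrix(dtraj, n_states, lag):
    """Estimate a row-stochastic transition matrix at a given lag time.

    dtraj    : 1D sequence of integer state labels (a discretized trajectory)
    n_states : number of discrete states
    lag      : lag time in trajectory frames (must be >= 1)
    """
    if lag < 1:
        raise ValueError("lag must be at least 1 frame")
    dtraj = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    # Sliding-window count of transitions i -> j separated by `lag` frames
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # States never seen as a starting point keep an all-zero row
    row_sums[row_sums == 0] = 1.0
    return counts / row_sums


# Toy two-state trajectory (hypothetical data, for illustration only)
dtraj = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0]
T = estimate_transition_matrix(dtraj, n_states=2, lag=1)
print(T)
```

Increasing `lag` in this sketch coarsens the time resolution, which is the knob the abstract ties to the granularity of the states.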
bioRxiv (Cold Spring Harbor Laboratory),
Journal year: 2023,
Issue: unknown
Published: Dec. 13, 2023
Abstract
Recent work suggests that AlphaFold2 (AF2), a deep learning-based model that can accurately infer protein structure from sequence, may discern important features of folded protein energy landscapes, defined by the diversity and frequency of different conformations in the folded state. Here, we test the limits of its predictive power on fold-switching proteins, which assume two structures with regions of distinct secondary and/or tertiary structure. Using several implementations of AF2, including two published enhanced sampling approaches, we generated >280,000 models of 93 fold-switching proteins whose experimentally determined conformations were likely in AF2's training set. Combining all models, AF2 predicted fold switching with a modest success rate of ~25%, indicating that it does not readily sample both experimentally characterized conformations of most fold switchers. Further, AF2's confidence metrics selected against models consistent with experimentally determined fold-switching conformations in favor of inconsistent models. Accordingly, these confidence metrics, though suggested to evaluate protein energetics reliably, did not discriminate between low and high energy states of fold-switching proteins. We then evaluated AF2's performance on seven fold-switching proteins outside of its training set, generating >159,000 models in total. Fold switching was accurately predicted in one of seven targets with moderate confidence. Further, AF2 demonstrated no ability to predict alternative conformations of two newly discovered targets without homologs in the set of 93 fold switchers. These results indicate that AF2 has more to learn about the underlying energetics of protein ensembles and highlight the need for further developments of methods that readily predict multiple protein conformations.
Nature Communications,
Journal year: 2024,
Issue: 15(1)
Published: Aug. 2, 2024
Protein kinases are molecular machines with rich sequence variation that distinguishes the two main evolutionary branches, tyrosine kinases (TKs) from serine/threonine kinases (STKs). Using a sequence co-variation Potts statistical energy model we previously concluded that TK catalytic domains are more likely than STKs to adopt an inactive conformation with the activation loop in an autoinhibitory folded conformation, due to intrinsic sequence effects. Here we investigate the structural basis for this phenomenon by integrating the sequence-based model with structure-based molecular dynamics (MD) to determine the effects of mutations on the free energy difference between active and inactive conformations, using a thermodynamic cycle involving many (n = 108) protein-mutation free energy perturbation (FEP) simulations in the active and inactive conformations. The results are consistent and support the hypothesis that the inactive conformation, DFG-out Activation Loop Folded, is a functional regulatory state that has been stabilized in TKs relative to STKs over the course of their evolution via the accumulation of residue substitutions that facilitate distinct substrate binding modes in trans and additional modes of regulation in cis for TKs. In this study, the authors identify the structural mechanism for the conformational preferences of TKs vs STKs and suggest that kinase function can explain these differences.
PLoS Computational Biology,
Journal year: 2024,
Issue: 20(7), p. e1012302
Published: July 24, 2024
Protein kinase function and interactions with drugs are controlled in part by the movement of the DFG and ɑC-Helix motifs that are related to the catalytic activity of the kinase. Small molecule ligands that elicit therapeutic effects with distinct selectivity profiles and residence times often depend on the active or inactive kinase conformation(s) they bind. Modern AI-based structural modeling methods have the potential to expand upon the limited availability of experimentally determined kinase structures in inactive states. Here, we first explored the conformational space of kinases in the PDB and models generated by AlphaFold2 (AF2) and ESMFold, two prominent AI-based protein structure prediction methods. Our investigation of AF2's ability to explore the conformational diversity of the kinome at various multiple sequence alignment (MSA) depths showed a bias within the predicted structures of kinases in DFG-in conformations, particularly those controlled by the DFG motif, based on their overabundance in the PDB. We demonstrate that predicting kinase structures using AF2 at lower MSA depths explored these alternative conformations more extensively, including identifying previously unobserved conformations for 398 kinases. Ligand enrichment analyses for 23 kinases showed that, on average, docked models distinguished between active molecules and decoys better than random (average AUC (avgAUC) of 64.58), but select models perform well (e.g., avgAUCs for PTK2 and JAK2 were 79.28 and 80.16, respectively). Further analysis explained the ligand enrichment discrepancy between low- and high-performing kinase models as binding site occlusions that would preclude docking. The overall results of our analyses suggested that, although AF2 explored previously uncharted regions of the kinase conformational space and select models exhibited enrichment scores suitable for rational drug discovery, rigorous refinement of AF2 models is likely still necessary for drug discovery campaigns.
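The abstract above reports ligand enrichment as an AUC on a 0-100 scale, where 50 corresponds to random ranking. The abstract does not say how the AUC was computed, so the sketch below is only one common reading: the percentage of ligand/decoy pairs in which the known ligand receives the better docking score. The function name `enrichment_auc`, the example scores, and the convention that lower scores are better are all assumptions for illustration.

```python
def enrichment_auc(ligand_scores, decoy_scores):
    """Pairwise (Mann-Whitney style) ROC AUC, returned as a percentage.

    Assumes lower docking scores are better. Each ligand/decoy pair
    counts 1 if the ligand scores better and 0.5 if the scores tie.
    """
    if not ligand_scores or not decoy_scores:
        raise ValueError("need at least one ligand and one decoy score")
    wins = 0.0
    for lig in ligand_scores:
        for dec in decoy_scores:
            if lig < dec:
                wins += 1.0
            elif lig == dec:
                wins += 0.5
    return 100.0 * wins / (len(ligand_scores) * len(decoy_scores))


# Hypothetical docking scores for one kinase model (illustration only)
ligands = [-9.1, -8.4, -6.0]
decoys = [-7.5, -6.5, -5.0, -8.0]
print(enrichment_auc(ligands, decoys))  # 9 of 12 pairs favor the ligand: 75.0
```

Averaging this value over several models of the same kinase would give a per-kinase figure comparable in spirit to the avgAUC quoted above, under the same assumptions.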
Bioinformatics Advances,
Journal year: 2023,
Issue: 3(1)
Published: Jan. 1, 2023
Protein kinases are a family of signaling proteins, crucial for maintaining cellular homeostasis. When dysregulated, kinases drive the pathogenesis of several diseases, and are thus one of the largest target categories for drug discovery. Kinase activity is tightly controlled by switching through several active and inactive conformations in their catalytic domain. Kinase inhibitors have been designed to engage kinases in specific conformational states, where each conformation presents a unique physico-chemical environment for therapeutic intervention. Thus, modeling kinases across conformations can enable the design of novel and optimally selective kinase drugs. Due to the recent success of AlphaFold2 in accurately predicting the 3D structure of proteins based on sequence, we investigated the conformational landscape of protein kinases as modeled by AlphaFold2. We observed that AlphaFold2 is able to model several kinase conformations across the kinome; however, certain conformations are only observed in specific kinase families. Furthermore, we show that the per residue predicted local distance difference test can capture information describing structural flexibility of kinases. Finally, we evaluated the docking performance of AlphaFold2 kinase structures for enriching known ligands. Taken together, we see an opportunity to leverage AlphaFold2 models for structure-based drug discovery against kinases in several pharmacologically relevant conformational states.
bioRxiv (Cold Spring Harbor Laboratory),
Journal year: 2023,
Issue: unknown
Published: Nov. 22, 2023
Abstract
Though typically associated with a single folded state, globular proteins are dynamic and often assume alternative or transient structures important for their functions [1,2]. Wayment-Steele, et al. steered ColabFold [3] to predict alternative structures of several proteins using a method they call AF-cluster [4]. They propose that AF-cluster "enables ColabFold to sample alternate states of known metamorphic proteins with high confidence" by first clustering multiple sequence alignments (MSAs) in a way that "deconvolves" coevolutionary information specific to different conformations and then using these clusters as input for ColabFold. Contrary to this Coevolution Assumption, clustered MSAs are not needed to make these predictions. Rather, these alternative structures can be predicted from single sequences and/or sequence similarity, indicating that coevolutionary information is unnecessary for predictive success and may not be used at all. These results suggest that AF-cluster's scope is likely limited to sequences with distinct-yet-homologous structures within ColabFold's training set.