Briefings in Bioinformatics, Journal Year: 2023, Volume and Issue: 24(4)
Published: July 1, 2023
Proteins are dynamic macromolecules that perform vital functions in cells. A protein structure determines its function, but this structure is not static, as proteins change their conformation to achieve various functions. Understanding the conformational landscapes of proteins is essential to understand their mechanism of action. Sets of carefully chosen conformations can summarize such complex landscapes and provide better insights into protein function than single conformations. We refer to these sets as representative ensembles. Recent advances in computational methods have led to an increase in the number of available structural datasets spanning conformational landscapes. However, extracting representative ensembles from such datasets is not an easy task, and many methods have been developed to tackle it. Our new approach, EnGens (short for ensemble generation), collects these methods into a unified framework for generating and analyzing representative ensembles. In this work, we: (1) overview existing methods and tools for ensemble generation and analysis; (2) unify existing approaches in an open-source Python package and a portable Docker image, providing interactive visualizations within a Jupyter Notebook pipeline; (3) test our pipeline on a few canonical examples from the literature. Representative ensembles produced by EnGens can be used for downstream tasks such as protein-ligand docking, Markov state modeling of protein dynamics and analysis of the effect of single-point mutations.
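The idea of summarizing a conformational landscape with a small set of chosen conformations can be sketched as follows. This is a minimal illustration only, not the EnGens implementation: it uses simple leader clustering on made-up 2D feature vectors, and the function name `pick_representatives`, the `cutoff` value and the `frames` data are all hypothetical.

```python
import math

def pick_representatives(features, cutoff):
    """Leader clustering: keep a conformation as a new representative only if
    it lies farther than `cutoff` from every representative chosen so far."""
    representatives = []
    for idx, feat in enumerate(features):
        if all(math.dist(feat, features[r]) > cutoff for r in representatives):
            representatives.append(idx)
    return representatives

# Hypothetical 2D features (e.g. two collective variables) for six conformations.
frames = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.0, 0.2), (10.0, 0.0)]

# Indices of the conformations retained as the representative set.
reps = pick_representatives(frames, cutoff=1.0)
print(reps)  # [0, 2, 5]
```

Here three of the six conformations are enough to cover the three well-separated regions of the toy landscape. Real pipelines would typically featurize structures, reduce dimensionality and use a more robust clustering method before picking representatives.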
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2022, Volume and Issue: unknown
Published: Nov. 22, 2022
Abstract
AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however, lacks the code and data required to train new models. These are necessary to (i) tackle new tasks, like protein-ligand complex structure prediction, (ii) investigate the process by which the model learns, which remains poorly understood, and (iii) assess the model’s generalization capacity to unseen regions of fold space. Here we report OpenFold, a fast, memory-efficient, and trainable implementation of AlphaFold2. We train OpenFold from scratch, fully matching the accuracy of AlphaFold2. Having established parity, we assess OpenFold’s capacity to generalize across fold space by retraining it using carefully designed datasets. We find that OpenFold is remarkably robust at generalizing despite extreme reductions in training set size and diversity, including near-complete elisions of classes of secondary structure elements. By analyzing intermediate structures produced by OpenFold during training, we also gain surprising insights into the manner in which the model learns to fold proteins, discovering that spatial dimensions are learned sequentially. Taken together, our studies demonstrate the power and utility of OpenFold, which we believe will prove to be a crucial resource for the protein modeling community.
Nature Biotechnology, Journal Year: 2023, Volume and Issue: 41(12), P. 1810 - 1819
Published: March 20, 2023
While AlphaFold2 can predict accurate protein structures from the primary sequence, challenges remain for proteins that undergo conformational changes or for which few homologous sequences are known. Here we introduce AlphaLink, a modified version of the AlphaFold2 algorithm that incorporates experimental distance restraint information into its network architecture. By employing sparse experimental contacts as anchor points, AlphaLink improves on the performance of AlphaFold2 in predicting challenging targets. We confirm this experimentally by using the noncanonical amino acid photo-leucine to obtain information on residue-residue contacts inside cells by crosslinking mass spectrometry. The program can predict distinct conformations of proteins on the basis of the distance restraints provided, demonstrating the value of experimental data in driving protein structure prediction. The noise-tolerant framework for integrating data in protein structure prediction presented here opens a path to accurate characterization of protein structures from in-cell data.
Frontiers in Molecular Biosciences, Journal Year: 2023, Volume and Issue: 10
Published: Feb. 16, 2023
Determining the three-dimensional structure of proteins in their native functional states has been a longstanding challenge in structural biology. While integrative structural biology has been the most effective way to get high-accuracy structures of different conformations and mechanistic insights for larger proteins, advances in deep machine-learning algorithms have paved the way to fully computational predictions. In this field, AlphaFold2 (AF2) pioneered ...
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2022, Volume and Issue: unknown
Published: Feb. 18, 2022
Abstract
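The AlphaFold Protein Structure Database contains predicted structures for millions of proteins. For the majority of human proteins that contain intrinsically disordered regions (IDRs), which do not adopt a stable structure, it is generally assumed that these regions have low AlphaFold2 confidence scores that reflect low-confidence structural predictions. Here, we show that AlphaFold2 assigns confident structures to nearly 15% of human IDRs. By comparison to experimental NMR data for a subset of IDRs that are known to conditionally fold (i.e., upon binding or under other specific conditions), we find that AlphaFold2 often predicts the structure of the conditionally folded state. Based on databases of IDRs that are known to conditionally fold, we estimate that AlphaFold2 can identify conditionally folding IDRs at a precision as high as 88% at a 10% false positive rate, which is remarkable considering that conditionally folded IDR structures were minimally represented in its training data. We find that human disease mutations are nearly 5-fold enriched in conditionally folded IDRs over IDRs in general, and that up to 80% of IDRs in prokaryotes are predicted to conditionally fold, compared to less than 20% of eukaryotic IDRs. These results indicate that a large majority of IDRs in the proteomes of human and other eukaryotes function in the absence of conditional folding, but the regions that do acquire folds are more sensitive to mutations. We emphasize that the AlphaFold2 predictions do not reveal functionally relevant structural plasticity within IDRs and cannot offer realistic ensemble representations of conditionally folded IDRs.
Significance Statement
AlphaFold2 and other machine learning-based methods can accurately predict the structures of most proteins. However, nearly two-thirds of human proteins contain segments that are highly flexible and do not autonomously fold, otherwise known as intrinsically disordered regions (IDRs). In general, IDRs interconvert rapidly between a large number of different conformations, posing a significant problem for protein structure prediction methods that define one or a small number of stable conformations. Here, we found that AlphaFold2 can readily identify structures for a subset of IDRs that fold under certain conditions (conditional folding). We leverage AlphaFold2’s predictions of conditionally folded IDRs to quantify the extent of conditional folding across the tree of life, and to rationalize disease-causing mutations in IDRs.
Classifications: Biological Sciences; Biophysics and Computational Biology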
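The two metrics quoted in the abstract above (precision and false positive rate) can be made concrete with a small worked example. This is only a sketch under stated assumptions: the per-region confidence scores, the labels and the threshold of 70 are invented for illustration and are not data or settings from the study, and `precision_and_fpr` is a hypothetical helper name.

```python
def precision_and_fpr(scores, labels, threshold):
    """Call a region 'conditionally folding' when its score >= threshold,
    then compare the calls against the known labels."""
    tp = fp = tn = fn = 0
    for score, is_folder in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_folder:
            tp += 1
        elif predicted and not is_folder:
            fp += 1
        elif not predicted and is_folder:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, fpr

# Hypothetical mean confidence score per disordered region, with made-up labels
# saying whether the region is known to fold conditionally.
scores = [92, 85, 78, 74, 66, 55, 48, 40, 35, 30]
labels = [True, True, True, False, True, False, False, False, False, False]

precision, fpr = precision_and_fpr(scores, labels, threshold=70)
print(f"precision={precision:.2f}, false positive rate={fpr:.2f}")
# precision=0.75, false positive rate=0.17
```

Raising the threshold generally lowers the false positive rate at the cost of missing more true conditional folders, which is why a precision figure is reported at a fixed false positive rate.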
Proteins Structure Function and Bioinformatics, Journal Year: 2023, Volume and Issue: 91(12), P. 1734 - 1746
Published: Aug. 7, 2023
AlphaFold2 has revolutionized protein structure prediction by achieving high accuracy comparable to experimentally determined structures. However, there is still room for improvement, especially for challenging cases like multimers. A key to the success of AlphaFold is its ability to assess and rank its own predictions. Our basic idea for the Wallner group in CASP15 was to exploit this excellent scoring function by massive sampling. To achieve this goal, we conducted AlphaFold runs using six different settings, with templates, without templates, and with an increased number of recycles for both multimer v1 and v2 weights. In all instances, we enabled dropout layers during inference, allowing sampling of uncertainty and enhancing the diversity of the generated models. In total, 274 289 models were generated for 38 targets in CASP15, with a median of 4810 models per target. Of these 38 targets, 10 were high quality, 11 were medium quality, 11 were acceptable, and only 6 were incorrect. The improvement over the baseline method, NBIS-AF2-multimer, was substantial, with the mean DockQ increasing from 0.43 to 0.56, and several targets showing a DockQ score increase of +0.6 units. This is remarkable, considering that NBIS-AF2-multimer used identical input data. The success can be attributed to the diversified sampling using dropout with different settings and, in particular, the use of multimer v1, which is much more susceptible to sampling compared to v2. The method is available here: http://wallnerlab.org/AFsample/.
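The sample-then-rank strategy described above can be sketched in a few lines. This is an illustration under stated assumptions, not the AFsample code: the models, setting names, confidence values and DockQ values are invented, `select_top_model` and `dockq_class` are hypothetical helpers, and the DockQ cutoffs (0.23, 0.49, 0.80) are the commonly used quality bands rather than values stated in the abstract.

```python
from collections import Counter

# Commonly used DockQ quality bands (an assumption; not stated in the abstract).
DOCKQ_CUTOFFS = [(0.80, "high"), (0.49, "medium"), (0.23, "acceptable")]

def dockq_class(dockq):
    """Map a DockQ score to a quality label."""
    for cutoff, label in DOCKQ_CUTOFFS:
        if dockq >= cutoff:
            return label
    return "incorrect"

def select_top_model(models):
    """Pick the sampled model with the highest self-assessed confidence."""
    return max(models, key=lambda m: m["confidence"])

# Hypothetical pooled samples from several settings for two targets.
samples = {
    "T1": [
        {"setting": "v1_templates", "confidence": 0.62, "dockq": 0.35},
        {"setting": "v1_no_templates", "confidence": 0.91, "dockq": 0.83},
        {"setting": "v2_more_recycles", "confidence": 0.70, "dockq": 0.51},
    ],
    "T2": [
        {"setting": "v2_templates", "confidence": 0.55, "dockq": 0.10},
        {"setting": "v1_more_recycles", "confidence": 0.48, "dockq": 0.30},
    ],
}

# Selection uses only the confidence score; DockQ is known only after evaluation.
selected = {target: select_top_model(models) for target, models in samples.items()}
tally = Counter(dockq_class(m["dockq"]) for m in selected.values())
print(selected["T1"]["setting"], dict(tally))
# v1_no_templates {'high': 1, 'incorrect': 1}
```

The second toy target shows the limitation of the approach: when the confidence score misranks the samples, a better model in the pool goes unselected, so broader sampling helps only as far as the scoring function is reliable.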
JACS Au, Journal Year: 2023, Volume and Issue: 3(6), P. 1554 - 1562
Published: June 6, 2023
The recent success of AlphaFold2 (AF2) and other deep learning (DL) tools in accurately predicting the folded three-dimensional (3D) structure of proteins and enzymes has revolutionized the structural biology and protein design fields. The 3D structure indeed reveals key information on the arrangement of the catalytic machinery of enzymes and which structural elements gate the active site pocket. However, comprehending enzymatic activity requires a detailed knowledge of the chemical steps involved along the catalytic cycle and the exploration of the multiple thermally accessible conformations that enzymes adopt when in solution. In this Perspective, some of the recent studies showing the potential of AF2 in elucidating the conformational landscape of enzymes are provided. Selected examples of the key developments of AF2-based and DL methods for protein design are discussed, as well as a few enzyme design cases. These studies show the potential of AF2 and DL for allowing the routine computational design of efficient enzymes.
Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)
Published: Sept. 6, 2023
Although most globular proteins fold into a single stable structure, an increasing number have been shown to remodel their secondary and tertiary structures in response to cellular stimuli. State-of-the-art algorithms predict that these fold-switching proteins adopt only one stable structure, missing their functionally critical alternative folds. Why these algorithms predict a single fold is unclear, but all of them infer protein structure from coevolved amino acid pairs. Here, we hypothesize that coevolutionary signatures are being missed. Suspecting that single-fold variants could be masking these signatures, we developed an approach, called Alternative Contact Enhancement (ACE), to search both highly diverse protein superfamilies (composed of single-fold and fold-switching variants) and protein subfamilies with more fold-switching variants. ACE successfully revealed coevolution of amino acid pairs uniquely corresponding to both conformations of 56/56 fold-switching proteins from distinct families. Then, we used ACE-derived contacts to (1) predict two experimentally consistent conformations of a candidate protein with unsolved structure and (2) develop a blind prediction pipeline for fold-switching proteins. The discovery of widespread dual-fold coevolution indicates that fold-switching sequences have been preserved by natural selection, implying that their functionalities provide evolutionary advantage and paving the way for predictions of diverse protein structures from single sequences.