PLoS Biology,
Journal Year:
2022,
Volume and Issue:
20(5), P. e3001636 - e3001636
Published: May 16, 2022
The recent revolution in computational protein structure prediction provides folding models for entire proteomes, which can now be integrated with large-scale experimental data. Mass spectrometry (MS)-based proteomics has identified and quantified tens of thousands of posttranslational modifications (PTMs), most of them of uncertain functional relevance. In this study, we determine the structural context of these PTMs and investigate how this information can be leveraged to pinpoint potential regulatory sites. Our analysis uncovers global patterns of PTM occurrence across folded and intrinsically disordered regions. We found that this information can help to distinguish regulatory PTMs from those marking improperly folded proteins. Interestingly, the human proteome contains thousands of proteins that have large folded domains linked by short, disordered regions that are strongly enriched in regulatory phosphosites. These include well-known kinase activation loops that induce protein conformational changes upon phosphorylation. This regulatory mechanism appears to be widespread in kinases but also occurs in other protein families such as solute carriers. It is not limited to phosphorylation but includes ubiquitination and acetylation sites as well. Furthermore, we performed three-dimensional proximity analysis, which revealed examples of spatial coregulation of different PTM types and potential PTM crosstalk. To enable the community to build upon these first analyses, we provide tools for 3D visualization of proteomics data and PTMs as well as python libraries for data accession and processing.
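The three-dimensional proximity analysis mentioned in this abstract can be illustrated with a small sketch. It is not the authors' library or method: the function name, the site records, the coordinates, and the 10 Å cutoff are all hypothetical choices made for illustration, and a real analysis would take residue coordinates from a predicted structure file.

```python
import math
from itertools import combinations


def proximal_ptm_pairs(sites, cutoff=10.0):
    """Return (residue_a, residue_b, distance) for every pair of PTM sites
    whose coordinates lie within `cutoff` angstroms of each other.

    `cutoff` is an illustrative default, not a value taken from the paper.
    """
    pairs = []
    for a, b in combinations(sites, 2):
        distance = math.dist(a["xyz"], b["xyz"])  # Euclidean distance
        if distance <= cutoff:
            pairs.append((a["residue"], b["residue"], round(distance, 2)))
    return pairs


# Hypothetical PTM sites with made-up coordinates (in angstroms).
sites = [
    {"residue": 15, "ptm": "phosphorylation", "xyz": (0.0, 0.0, 0.0)},
    {"residue": 120, "ptm": "acetylation", "xyz": (3.0, 4.0, 0.0)},
    {"residue": 300, "ptm": "ubiquitination", "xyz": (30.0, 0.0, 0.0)},
]

close_pairs = proximal_ptm_pairs(sites)
```

Residues 15 and 120 are far apart in sequence but close in space in this toy input, which is the kind of pair a sequence-only view would miss.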
Nucleic Acids Research,
Journal Year:
2022,
Volume and Issue:
51(D1), P. D638 - D646
Published: Nov. 12, 2022
Much of the complexity within cells arises from functional and regulatory interactions among proteins. The core of these interactions is increasingly known, but novel interactions continue to be discovered, and the information remains scattered across different database resources, experimental modalities and levels of mechanistic detail. The STRING database (https://string-db.org/) systematically collects and integrates protein-protein interactions, both physical interactions as well as functional associations. The data originate from a number of sources: automated text mining of the scientific literature, computational interaction predictions from co-expression, conserved genomic context, databases of interaction experiments and known complexes/pathways from curated sources. All of these interactions are critically assessed, scored, and subsequently automatically transferred to less well-studied organisms using hierarchical orthology information. The data can be accessed via the website, but also programmatically and via bulk downloads. The most recent developments in STRING (version 12.0) are: (i) it is now possible to create, browse and analyze a full interaction network for any novel genome of interest, by submitting its complement of encoded proteins, (ii) the co-expression channel now uses variational auto-encoders to predict interactions, and it covers two new sources, single-cell RNA-seq and experimental proteomics data and (iii) the confidence in each experimentally derived interaction is now estimated based on the detection method used, and communicated to the user in the web-interface. Furthermore, STRING continues to enhance its facilities for functional enrichment analysis, which are now fully available also for user-submitted genomes.
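As a rough illustration of the programmatic access mentioned in this abstract, the sketch below only builds a request URL and sends nothing over the network. The endpoint layout (`/api/<format>/network`) and the parameter names `identifiers` and `species` are assumptions that should be checked against the current STRING API documentation; the helper name and the example gene names are hypothetical.

```python
from urllib.parse import urlencode

# Base address from the abstract; the "/api" path is an assumption.
STRING_API = "https://string-db.org/api"


def string_network_url(identifiers, species, output_format="tsv"):
    """Build a URL for a STRING network query (assumed endpoint layout).

    Multiple identifiers are joined with a carriage return, which
    urlencode percent-encodes as %0D.
    """
    params = {"identifiers": "\r".join(identifiers), "species": species}
    return f"{STRING_API}/{output_format}/network?{urlencode(params)}"


# 9606 is the NCBI taxonomy identifier for human.
url = string_network_url(["TP53", "MDM2"], species=9606)
```

Keeping URL construction separate from the HTTP call makes it easy to inspect or log a query before sending it with whatever client is preferred.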
Nucleic Acids Research,
Journal Year:
2022,
Volume and Issue:
51(D1), P. D488 - D508
Published: Nov. 24, 2022
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ∼200 000 experimentally-determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a 'living data resource.' Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.
Signal Transduction and Targeted Therapy,
Journal Year:
2023,
Volume and Issue:
8(1)
Published: March 1, 2023
The TP53 tumor suppressor is the most frequently altered gene in human cancers, and has been a major focus of oncology research. The p53 protein is a transcription factor that can activate the expression of multiple target genes and plays critical roles in regulating cell cycle, apoptosis, and genomic stability, and is widely regarded as the "guardian of the genome". Accumulating evidence has shown that p53 also regulates cell metabolism, ferroptosis, tumor microenvironment, autophagy and so on, all of which contribute to tumor suppression. Mutations in TP53 not only impair its tumor suppressor function, but also confer oncogenic properties to p53 mutants. Since p53 is mutated and inactivated in most malignant tumors, it has been a very attractive target for developing new anti-cancer drugs. However, until recently, p53 was considered an "undruggable" target and little progress has been made with p53-targeted therapies. Here, we provide a systematic review of the diverse molecular mechanisms of the p53 signaling pathway and how TP53 mutations impact tumor progression. We also discuss key structural features of the p53 protein and its inactivation by oncogenic mutations. In addition, we review the efforts that have been made in p53-targeted therapies, and discuss the challenges that have been encountered in clinical development.
Science,
Journal Year:
2022,
Volume and Issue:
377(6604), P. 387 - 394
Published: July 21, 2022
The binding and catalytic functions of proteins are generally mediated by a small number of functional residues held in place by the overall protein structure. Here, we describe deep learning approaches for scaffolding such functional sites without needing to prespecify the fold or secondary structure of the scaffold. The first approach, "constrained hallucination," optimizes sequences such that their predicted structures contain the desired functional site. The second approach, "inpainting," starts from the functional site and fills in additional sequence and structure to create a viable protein scaffold in a single forward pass through a specifically trained RoseTTAFold network. We use these two methods to design candidate immunogens, receptor traps, metalloproteins, enzymes, and protein-binding proteins and validate the designs using a combination of in silico and experimental tests.
Signal Transduction and Targeted Therapy,
Journal Year:
2023,
Volume and Issue:
8(1)
Published: March 14, 2023
Abstract
AlphaFold2 (AF2) is an artificial intelligence (AI) system developed by DeepMind that can predict three-dimensional (3D) structures of proteins from amino acid sequences with atomic-level accuracy. Protein structure prediction is one of the most challenging problems in computational biology and chemistry, and has puzzled scientists for 50 years. The advent of AF2 represents unprecedented progress in protein structure prediction and has attracted much attention. The subsequent release of structures of more than 200 million proteins predicted by AF2 further aroused great enthusiasm in the science community, especially in the fields of biology and medicine. AF2 is thought to have a significant impact on structural biology and on research areas that need protein structure information, such as drug discovery, protein design, and prediction of protein function. Although not much time has passed since AF2 was developed, there are already quite a few application studies of AF2 in the fields of biology and medicine, many of them having preliminarily proved the potential of AF2. To better understand AF2 and promote its applications, we will in this article summarize the principle and system architecture of AF2 as well as the recipe of its success, and particularly focus on reviewing its applications in the fields of biology and medicine. Limitations of current AF2 prediction will also be discussed.
Protein Science,
Journal Year:
2022,
Volume and Issue:
31(8)
Published: July 13, 2022
High-resolution experimental structural determination of protein-protein interactions has led to valuable mechanistic insights, yet due to the massive number of interactions and experimental limitations there is a need for computational methods that can accurately model their structures. Here we explore the use of the recently developed deep learning method, AlphaFold, to predict structures of protein complexes from sequence. With a benchmark of 152 diverse heterodimeric protein complexes, multiple implementations and parameters of AlphaFold were tested for accuracy. Remarkably, many cases (43%) had near-native models (medium or high critical assessment of predicted interactions accuracy) generated as top-ranked predictions by AlphaFold, greatly surpassing the performance of unbound protein-protein docking (9% success rate for near-native top-ranked models); however, AlphaFold modeling of antibody-antigen complexes within our set was unsuccessful. We identified sequence and structural features associated with lack of AlphaFold success, and we also investigated the impact of multiple sequence alignment input. Benchmarking of a multimer-optimized version of AlphaFold (AlphaFold-Multimer) with a set of recently released antibody-antigen structures confirmed a low rate of success for antibody-antigen complexes (11% success), and we found that T cell receptor-antigen complexes are likewise not accurately modeled by that algorithm, showing that adaptive immune recognition poses a challenge for the current AlphaFold algorithm and model. Overall, our study demonstrates that end-to-end deep learning can accurately model many transient protein complexes, and highlights areas of improvement for future developments to reliably model any protein-protein interaction of interest.
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2022,
Volume and Issue:
unknown
Published: April 10, 2022
Abstract
We consider the problem of predicting a protein sequence from its backbone atom coordinates. Machine learning approaches to this problem to date have been limited by the number of available experimentally determined protein structures. We augment training data by nearly three orders of magnitude by predicting structures for 12M protein sequences using AlphaFold2. Trained with this additional data, a sequence-to-sequence transformer with invariant geometric input processing layers achieves 51% native sequence recovery on structurally held-out backbones with 72% recovery for buried residues, an overall improvement of almost 10 percentage points over existing methods. The model generalizes to a variety of more complex tasks including design of protein complexes, partially masked structures, binding interfaces, and multiple states.
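The native sequence recovery metric quoted in this abstract is, in its usual sense, the fraction of positions where a designed sequence reproduces the native amino acid. The sketch below assumes that simple definition; the function name, the optional per-residue mask (used here to restrict the count to buried residues), and the example sequences are hypothetical, and the paper's exact evaluation protocol is not reproduced.

```python
def sequence_recovery(native, designed, mask=None):
    """Fraction of positions where `designed` matches `native`.

    `mask` is an optional sequence of booleans selecting which positions
    to score (for example, buried residues only).
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if mask is not None and len(mask) != len(native):
        raise ValueError("mask must have the same length as the sequences")
    positions = [i for i in range(len(native)) if mask is None or mask[i]]
    if not positions:
        raise ValueError("no positions selected")
    matches = sum(native[i] == designed[i] for i in positions)
    return matches / len(positions)


# Hypothetical sequences and burial mask, for illustration only.
native = "MKTAYIAK"
designed = "MKSAYLAK"
buried = [False, True, False, True, True, False, True, False]

overall = sequence_recovery(native, designed)
buried_only = sequence_recovery(native, designed, mask=buried)
```

In this toy input both mismatches fall outside the masked positions, so recovery over the masked subset is higher than the overall value.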
Science,
Journal Year:
2022,
Volume and Issue:
376(6598)
Published: June 9, 2022
INTRODUCTION
The eukaryotic nucleus protects the genome and is enclosed by the two membranes of the nuclear envelope. Nuclear pore complexes (NPCs) perforate the nuclear envelope to facilitate nucleocytoplasmic transport. With a molecular weight of ∼120 MDa, the human NPC is one of the largest protein complexes. Its ~1000 proteins are taken in multiple copies from a set of about 30 distinct nucleoporins (NUPs). They can be roughly categorized into two classes. Scaffold NUPs contain folded domains and form a cylindrical scaffold architecture around a central channel. Intrinsically disordered NUPs line the scaffold and extend into the central channel, where they interact with cargo complexes. The NPC architecture is highly dynamic. It responds to changes in nuclear envelope tension with conformational breathing that manifests in dilation and constriction movements. Elucidating the scaffold architecture, ultimately at atomic resolution, will be important for gaining a more precise understanding of NPC function and dynamics but imposes a substantial challenge for structural biologists.
RATIONALE
Considerable progress has been made toward this goal by a joint effort in the field. A synergistic combination of complementary approaches has turned out to be critical. In situ structural biology techniques were used to reveal the overall layout of the NPC scaffold that defines the spatial reference for molecular modeling. High-resolution structures of many NUPs were determined in vitro. Proteomic analysis and extensive biochemical work unraveled the interaction network of NUPs. Integrative modeling has been used to combine the different types of data, resulting in a rough outline of the NPC scaffold. Previous structural models of the human NPC, however, were patchy and limited in accuracy owing to several challenges: (i) Many of the high-resolution structures of individual NUPs have been solved from distantly related species and, consequently, do not comprehensively cover their human counterparts. (ii) The scaffold is interconnected by a set of intrinsically disordered linker NUPs that are not straightforwardly accessible to common structural biology techniques. (iii) The NPC scaffold intimately embraces the fused inner and outer nuclear membranes in a distinctive topology and cannot be studied in isolation. (iv) The conformational dynamics of scaffold NUPs limits the resolution achievable in structure determination.
RESULTS
In this study, we used artificial intelligence (AI)-based prediction to generate an extensive repertoire of structural models of human NUPs and their subcomplexes. The resulting models cover various domains and interfaces that so far remained structurally uncharacterized. Benchmarking against previous and unpublished x-ray and cryo-electron microscopy structures revealed unprecedented accuracy. We obtained well-resolved cryo-electron tomographic maps of both the constricted and dilated conformational states of the human NPC. Using integrative modeling, we fitted the structural models of individual NUPs into the cryo-electron microscopy maps. We explicitly included several linker NUPs and traced their trajectory through the NPC scaffold. We elucidated in great detail how membrane-associated and transmembrane NUPs are distributed across the fusion topology of both nuclear membranes. The resulting architectural model increases the structural coverage of the human NPC scaffold by about twofold. We extensively validated our model against both earlier and new experimental data. The completeness of our model has enabled microsecond-long coarse-grained molecular dynamics simulations of the NPC scaffold within an explicit membrane environment and solvent. These simulations reveal that the NPC scaffold prevents the constriction of the otherwise stable double-membrane fusion pore to small diameters in the absence of membrane tension.
CONCLUSION
Our 70-MDa atomically resolved model covers >90% of the human NPC scaffold. It captures conformational changes that occur during dilation and constriction. It also reveals the precise anchoring sites for intrinsically disordered NUPs, the identification of which is a prerequisite for a complete and dynamic model of the NPC. Our study exemplifies how AI-based structure prediction may accelerate the elucidation of subcellular architecture at atomic resolution.
[Figure: see text].
Nature Communications,
Journal Year:
2022,
Volume and Issue:
13(1)
Published: April 1, 2022
Accurate descriptions of protein-protein interactions are essential for understanding biological systems. Remarkably accurate atomic structures have been recently computed for individual proteins by AlphaFold2 (AF2). Here, we demonstrate that the same neural network models from AF2 developed for single protein sequences can be adapted to predict the structures of multimeric protein complexes without retraining. In contrast to common approaches, our method, AF2Complex, does not require paired multiple sequence alignments. It achieves higher accuracy than some complex protein-protein docking strategies and provides a significant improvement over AF-Multimer, a development of AlphaFold for multimeric proteins. Moreover, we introduce metrics for predicting direct protein-protein interactions between arbitrary protein pairs and validate AF2Complex on some challenging benchmark sets and the E. coli proteome. Lastly, using the cytochrome c biogenesis system I as an example, we present high-confidence models of three sought-after assemblies formed by eight members of this system.