Although cytochrome P450 enzymes are the most versatile biocatalysts in nature, there is insufficient comprehension of the molecular mechanism underlying their functional innovation process. Here, by combining ancestral sequence reconstruction, reverse mutation assay, and progressive forward accumulation, we identified 5 founder residues in the catalytic pocket of flavone 6-hydroxylase (F6H) and proposed a “3-point fixation” model to elucidate the functional innovation mechanisms of P450s in nature. According to this design principle of the catalytic pocket, we further developed a de novo diffusion model (P450Diffusion) to generate artificial P450s. Ultimately, among the 17 non-natural P450s we generated, 10 designs exhibited significant F6H activity and 6 exhibited a 1.3- to 3.5-fold increase in catalytic capacity compared to the natural CYP706X1. This work not only explores the design principle of the catalytic pockets of P450s, but also provides an insight into the artificial design of P450 enzymes with desired functions.
ACS Catalysis, Journal Year: 2023, Volume and Issue: 13(21), P. 13863 - 13895
Published: Oct. 13, 2023
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid the discovery and annotation of enzymes, as well as suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.
Cell, Journal Year: 2024, Volume and Issue: 187(3), P. 526 - 544
Published: Feb. 1, 2024
Methods from artificial intelligence (AI) trained on large datasets of sequences and structures can now "write" proteins with new shapes and molecular functions de novo, without starting from proteins found in nature. In this Perspective, I will discuss the state of the field of de novo protein design at the juncture of physics-based modeling approaches and AI. New protein folds and higher-order assemblies can be designed with considerable experimental success rates, and difficult problems requiring tunable control over protein conformations and precise shape complementarity for molecular recognition are coming into reach. Emerging approaches incorporate engineering principles (tunability, controllability, and modularity) into the design process from the beginning. Exciting frontiers lie in deconstructing cellular functions with de novo proteins and, conversely, constructing synthetic cellular signaling from the ground up. As methods improve, many more challenges are unsolved.
ACS Central Science, Journal Year: 2024, Volume and Issue: 10(2), P. 226 - 241
Published: Feb. 5, 2024
Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency, or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity followed by directed evolution to improve its "fitness" for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery by functional annotation of known protein sequences or generating novel protein sequences with desired functions and (2) navigating protein fitness landscapes for fitness optimization by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown
Published: July 25, 2023
Abstract
Adapting large language models (LLMs) to protein sequences spawned the development of powerful protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein structure prediction. Now we can systematically and comprehensively explore the dual nature of proteins that act and exist as three-dimensional (3D) machines and evolve as linear strings of one-dimensional (1D) sequences. Here, we leverage pLMs to simultaneously model both modalities by combining 1D sequences with 3D structure in a single model. We encode protein structures as token sequences using the 3Di-alphabet introduced by the 3D-alignment method Foldseek. This new foundation pLM extracts the features and patterns of the resulting “structure-sequence” representation. Toward this end, we built a non-redundant dataset from AlphaFoldDB and fine-tuned an existing pLM (ProtT5) to translate between 3Di and amino acid sequences. As a proof-of-concept for our novel approach, dubbed Protein structure-sequence T5 (ProstT5), we showed improved performance for subsequent prediction tasks, and for “inverse folding”, namely the generation of novel protein sequences adopting a given structural scaffold (“fold”). Our work showcased the potential of pLMs to tap into the information-rich protein structure revolution fueled by AlphaFold2. ProstT5 paves the way to develop new tools integrating the vast resource of 3D predictions, and opens new research avenues in the post-AlphaFold2 era. ProstT5 is freely available for all at https://github.com/mheinzinger/ProstT5
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown
Published: Sept. 12, 2023
Abstract
Deep generative models are increasingly powerful tools for the in silico design of novel proteins. Recently, a family of generative models called diffusion models has demonstrated the ability to generate biologically plausible proteins that are dissimilar to any actual proteins seen in nature, enabling unprecedented capability and control in de novo protein design. However, current state-of-the-art models generate protein structures, which limits the scope of their training data and restricts generations to a small and biased subset of protein design space. Here, we introduce a general-purpose diffusion framework, EvoDiff, that combines evolutionary-scale data with the distinct conditioning capabilities of diffusion models for controllable protein generation in sequence space. EvoDiff generates high-fidelity, diverse, and structurally-plausible proteins that cover natural sequence and functional space. We show experimentally that EvoDiff generations express, fold, and exhibit expected secondary structure elements. Critically, EvoDiff can generate proteins inaccessible to structure-based models, such as those with disordered regions, while maintaining the ability to design scaffolds for functional structural motifs. We validate the universality of our sequence-based formulation by experimentally characterizing intrinsically-disordered mitochondrial targeting signals, metal-binding proteins, and protein binders designed using EvoDiff. We envision that EvoDiff will expand capabilities in protein engineering beyond the structure-function paradigm toward programmable, sequence-first design.
Nature Biotechnology, Journal Year: 2024, Volume and Issue: unknown
Published: April 23, 2024
In recent years, generative protein sequence models have been developed to sample novel sequences. However, predicting whether generated proteins will fold and function remains challenging. We evaluate a set of 20 diverse computational metrics to assess the quality of enzyme sequences produced by three contrasting generative models: ancestral sequence reconstruction, a generative adversarial network and a protein language model. Focusing on two enzyme families, we expressed and purified over 500 natural and generated sequences with 70-90% identity to the most similar natural sequences to benchmark computational metrics for predicting in vitro enzyme activity. Over three rounds of experiments, we developed a computational filter that improved the rate of experimental success by 50-150%. The proposed metrics and models will drive protein engineering research by serving as a benchmark for generative protein sequence models and helping to select active variants for experimental testing.
NAR Genomics and Bioinformatics, Journal Year: 2024, Volume and Issue: 6(4)
Published: Sept. 28, 2024
Adapting language models to protein sequences spawned the development of powerful protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein structure prediction. Now we can systematically and comprehensively explore the dual nature of proteins that act and exist as three-dimensional (3D) machines and evolve as linear strings of one-dimensional (1D) sequences. Here, we leverage pLMs to simultaneously model both modalities in a single model. We encode protein structures as token sequences using the 3Di-alphabet introduced by the 3D-alignment method
Chemosphere, Journal Year: 2024, Volume and Issue: 355, P. 141749 - 141749
Published: March 21, 2024
Plastic pollution has become a major global concern, posing numerous challenges for the environment and wildlife. Most conventional ways of plastics degradation are inefficient and cause great damage to ecosystems. The development of biodegradable plastics offers a promising solution for waste management. These plastics are designed to break down under various conditions, opening up new possibilities to mitigate the negative impact of traditional plastics. Microbes, including bacteria and fungi, play a crucial role in the degradation of bioplastics by producing and secreting extracellular enzymes, such as cutinase, lipases, and proteases. However, these microbial enzymes are sensitive to extreme environmental conditions, such as temperature and acidity, affecting their functions and stability. To address these challenges, scientists have employed protein engineering and immobilization techniques to enhance enzyme stability and predict structures. Strategies such as improving enzyme-substrate interaction, increasing thermostability, reinforcing the bonding between the active site and the substrate, and refining enzyme activity are being utilized to boost functionality. Recently, bioengineering through gene cloning and expression in potential microorganisms has revolutionized the biodegradation of bioplastics.
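This review aimed to discuss the most recent protein engineering strategies for modifying bioplastic-degrading enzymes in terms of stability and functionality, including enzyme thermostability enhancement, reinforcing the substrate binding to the enzyme active site, refining with other enzymes, and improvement of enzyme surface and substrate action. Additionally, exoenzymes discovered by metagenomics were emphasized.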
ACS Catalysis, Journal Year: 2023, Volume and Issue: 13(21), P. 14454 - 14469
Published: Oct. 26, 2023
Emerging computational tools promise to revolutionize protein engineering for biocatalytic applications and accelerate the development timelines previously needed to optimize an enzyme to its more efficient variant. For over a decade, the benefits of predictive algorithms have helped scientists and engineers navigate the complexity of functional protein sequence space. More recently, spurred by dramatic advances in underlying computational tools, the promise of faster, cheaper, and more accurate enzyme identification, characterization, and engineering has catapulted terms such as artificial intelligence and machine learning to the must-have vocabulary in the field. This Perspective aims to showcase the current status of applications in the pharmaceutical industry and also to discuss and celebrate the innovative approaches in protein science by highlighting their potential in selected recent developments and offering thoughts on future opportunities for biocatalysis. It also critically assesses the technology's limitations, unanswered questions, and unmet challenges.
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown
Published: March 4, 2023
Abstract
In recent years, generative protein sequence models have been developed to sample novel sequences. However, predicting whether generated proteins will fold and function remains challenging. We evaluate computational metrics to assess the quality of enzyme sequences produced by three contrasting generative models: ancestral sequence reconstruction, a generative adversarial network, and a protein language model. Focusing on two enzyme families, we expressed and purified over 440 natural and generated sequences with 70-90% identity to the most similar natural sequences to benchmark computational metrics for predicting in vitro enzyme activity. Over three rounds of experiments, we developed a computational filter that improved experimental success rates by 44-100%. Surprisingly, neither sequence identity to natural sequences nor AlphaFold2 residue-confidence scores were predictive of enzyme activity. The proposed metrics and models will drive protein engineering research by serving as a benchmark for generative protein sequence models and helping to select active variants to test experimentally.