Journal of Advanced Research,
Year: 2025, Issue: unknown
Published: Jan. 1, 2025
Antimicrobial peptides (AMPs) present a promising avenue to combat the growing threat of antibiotic resistance. The ruminant gastrointestinal microbiome serves as a unique ecosystem that offers untapped potential for AMP discovery.
ACS Catalysis,
Year: 2023, Issue: 13(21), pp. 13863 - 13895
Published: Oct. 13, 2023
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid the discovery and annotation of enzymes, as well as suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough validation of emerging models before their use for rational design. We present our opinions on the fundamental problems and outline the potential directions for future research.
Cells have evolved mechanisms to distribute ~10 billion protein molecules to subcellular compartments where diverse proteins involved in shared functions must assemble. Here, we demonstrate that proteins with shared functions share amino acid sequence codes that guide them to compartment destinations. A protein language model, ProtGPS, was developed that predicts with high performance the compartment localization of human proteins excluded from the training set. ProtGPS successfully guided generation of novel protein sequences that selectively assemble in the nucleolus. ProtGPS identified pathological mutations that change this code and lead to altered subcellular localization of proteins. Our results indicate that protein sequences contain not only a folding code, but also a previously unrecognized code governing their distribution to subcellular compartments.
In recent years, the rapid growth of biological data has increased interest in using bioinformatics to analyze and interpret this data. Proteomics, which studies the structure, function, and interactions of proteins, is a crucial area of bioinformatics. Using natural language processing (NLP) techniques in proteomics is an emerging field that combines machine learning and text mining. Recently, transformer‐based NLP models have gained significant attention for their ability to process variable‐length input sequences in parallel, using self‐attention mechanisms to capture long‐range dependencies. In this review paper, we discuss advancements in transformer‐based NLP models in proteome bioinformatics and examine their advantages, limitations, and potential applications to improve the accuracy and efficiency of various tasks. Additionally, we highlight the challenges and future directions of these models in proteome bioinformatics research. Overall, this review provides valuable insights into the potential of transformer‐based NLP models to revolutionize proteome bioinformatics.
Briefings in Bioinformatics,
Year: 2023, Issue: 25(1)
Published: Nov. 22, 2023
Within drug discovery, the goal of AI scientists and cheminformaticians is to help identify molecular starting points that will develop into safe and efficacious drugs while reducing costs, time and failure rates. To achieve this goal, it is crucial to represent molecules in a digital format that makes them machine-readable and facilitates the accurate prediction of properties that drive decision-making. Over the years, molecular representations have evolved from intuitive and human-readable formats to bespoke numerical descriptors and fingerprints, and now to learned representations that capture patterns and salient features across vast chemical spaces. Among these, sequence-based and graph-based representations of small molecules have become highly popular. However, each approach has strengths and weaknesses across dimensions such as generality, computational cost, inversibility for generative applications and interpretability, which can be critical in informing practitioners’ decisions. As the drug discovery landscape evolves, opportunities for innovation continue to emerge. These include the creation of representations for high-value, low-data regimes, the distillation of broader biological knowledge into novel learned representations and the modeling of up-and-coming therapeutic modalities.
bioRxiv (Cold Spring Harbor Laboratory),
Year: 2024, Issue: unknown
Published: March 17, 2024
Protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure, and function. However, these models overlook decades of research into biophysical factors governing protein function. We propose Mutational Effect Transfer Learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, we pretrain transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure, and energetics. We finetune METL on experimental sequence-function data to harness these biophysical signals and apply them when predicting protein properties like thermostability, catalytic activity, and fluorescence. METL excels in challenging protein engineering tasks like generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. We demonstrate METL's ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering.
Energy,
Year: 2024, Issue: 307, pp. 132636 - 132636
Published: July 29, 2024
Due to the recent advancements in Internet of Things and data science techniques, a wide range of studies have investigated the use of data mining (DM) and machine learning (ML) algorithms to enhance building energy management (BEM). However, different classes of DM and ML algorithms feature different mechanisms and capabilities, resulting in their distinct roles and performance in BEM. Appropriate integration of different categories of DM and ML algorithms in BEM is essential to promote their application and provide guidance for new topic areas. This study presents a literature review of DM and ML techniques in key areas of BEM, including energy performance evaluation, energy usage prediction, and demand flexibility optimization. The study categorizes DM and ML techniques into three main categories, supervised DM, unsupervised DM, and reinforcement learning (RL). Unsupervised DM techniques are primarily used for performance assessment, while supervised DM techniques are mainly employed for benchmarking and prediction. RL has been utilized for optimal control to improve energy efficiency, flexibility, and indoor thermal comfort. The strengths and shortcomings of these methods in terms of their applications in BEM are discussed, along with some suggestions for future research in this field.
Journal of Cheminformatics,
Year: 2024, Issue: 16(1)
Published: March 14, 2024
Protein-ligand binding site prediction is a useful tool for understanding the functional behaviour and potential drug-target interactions of a novel protein of interest. However, most binding site prediction methods are tested by providing crystallised ligand-bound (holo) structures as input. This testing regime is insufficient to understand the performance on novel protein targets where experimental structures are not available. An alternative option is to provide computationally predicted protein structures, but this is not commonly tested. However, due to the training data used, computationally-predicted protein structures tend to be extremely accurate, and are often biased toward a holo conformation. In this study we describe and benchmark IF-SitePred, a protein-ligand binding site prediction method which is based on the labelling of ESM-IF1 protein language model embeddings combined with point cloud annotation and clustering. We show that not only is IF-SitePred competitive with state-of-the-art methods when predicting binding sites on experimental structures, but it performs better on proxies for novel proteins where low accuracy has been simulated by molecular dynamics. Finally, IF-SitePred outperforms other methods if ensembles of predicted protein structures are generated.
Journal of Chemical Information and Modeling,
Year: 2025, Issue: unknown
Published: Feb. 2, 2025
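Machine learning (ML) models have become increasingly popular for predicting and designing structures and properties of peptides and proteins. These ML models typically use peptides and proteins containing only canonical amino acids as the training data. Consequently, these models struggle to make accurate predictions for new amino acids that are absent in the training data set (e.g., noncanonical amino acids). One approach to improve the accuracy is to collect more training data with the desired amino acids. However, this strategy is suboptimal as new training data may not be easily attainable, and additional time is required to retrain the models. Alternatively, the extendibility of ML models can be improved if the amino acid features used are representative and generalizable to unseen amino acids. Herein, we develop amino acid features using molecular dynamics (MD) simulation results. Specifically, for a given amino acid, we perform MD simulation of its dipeptide and create features based on the backbone (ϕ, ψ) distributions and electrostatic potentials. We demonstrate that these features enable our ML models to accurately predict structural ensembles of cyclic peptides containing amino acids not present in the original training data set. For example, we build ML models to predict cyclic pentapeptide structures, using a library of 15 amino acids, and test the models on the same 15-amino-acid library or an extended 50-amino-acid library. When features such as Morgan fingerprints and MACCS keys are used to represent amino acids, the models achieve R² = 0.963 for pentapeptides of the 15-amino-acid library, but the models' performances decrease significantly to 0.430 and 0.508, respectively, when tasked with the extended library of 50 amino acids. On the other hand, the model using our features outperforms those using Morgan fingerprints and MACCS keys, with R² = 0.700. Overall, instead of having to collect more training data, our features enable ML models to make predictions for peptide sequences with amino acids not originally in the training data at the mere cost of performing MD simulations.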
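As a rough illustration of the backbone-distribution feature mentioned in the abstract above, the sketch below bins sampled (ϕ, ψ) angles into a normalized 2D histogram and flattens it into a feature vector. The function name `ramachandran_histogram`, the bin count, and the histogram encoding itself are assumptions made for illustration. The abstract does not say how the authors encode the distributions, and the electrostatic-potential features are not covered here.

```python
from typing import List, Sequence, Tuple


def ramachandran_histogram(
    angles: Sequence[Tuple[float, float]], bins: int = 12
) -> List[float]:
    """Flattened, normalized 2D histogram of (phi, psi) angles in degrees.

    Hypothetical helper: one simple way to turn sampled backbone
    dihedrals into a fixed-length feature vector.
    """
    if bins <= 0:
        raise ValueError("bins must be positive")
    counts = [0] * (bins * bins)
    width = 360.0 / bins
    for phi, psi in angles:
        # Shift and wrap each angle into [0, 360) so that periodic
        # values such as -180 and 180 fall into the same bin.
        i = int(((phi + 180.0) % 360.0) // width)
        j = int(((psi + 180.0) % 360.0) // width)
        # Clamp to guard against floating-point rounding at the upper edge.
        i = min(i, bins - 1)
        j = min(j, bins - 1)
        counts[i * bins + j] += 1
    total = len(angles)
    if total == 0:
        return [0.0] * (bins * bins)
    # Normalize so the features sum to 1 regardless of trajectory length.
    return [c / total for c in counts]
```

Calling `ramachandran_histogram(samples, bins=12)` on a list of (ϕ, ψ) pairs taken from a dipeptide trajectory would give a 144-element vector. Normalizing by the number of frames keeps vectors comparable across simulations of different lengths.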