The advent of machine learning (ML) in computational chemistry heralds a transformative approach to one of the quintessential challenges in computer-aided drug design (CADD): the accurate and cost-effective calculation of atomic interactions. By leveraging a neural network (NN) potential, we address this balance and push the boundaries of the NN potential's representational capacity. Our work details the development of a robust general-purpose NN potential architected on the framework of DPA-2, a deep potential with attention, which demonstrates remarkable fidelity in replicating the interatomic potential energy surface for drug-like molecules comprising eight critical chemical elements: H, C, N, O, F, S, Cl, and P. We employed state-of-the-art molecular dynamics techniques, including temperature acceleration and enhanced sampling, to construct a comprehensive dataset and ensure exhaustive coverage of the relevant configurational spaces. Our rigorous testing protocols, including torsion scanning, global minimum searches, and high-temperature MD simulations across various organic molecules, have culminated in an NN model that achieves precision commensurate with that of a highly regarded DFT model, while significantly outstripping the accuracy of prevalent semi-empirical methods. This study presents a leap forward in the predictive modelling of atomic interactions, offering extensive applications in drug development and beyond.
Artificial Intelligence Review, Journal Year: 2024, Volume and Issue: 57(4)
Published: March 29, 2024
Abstract
Molecular dynamics (MD) simulations are a key computational chemistry technique that provides dynamic insight into the underlying atomic-level processes in the system under study. These insights not only improve our understanding of the molecular world, but also aid in the design of experiments and targeted interventions. Currently, MD is associated with several limitations, the most important of which are: insufficient sampling, inadequate accuracy of the atomistic models, and challenges with proper analysis and interpretation of the obtained trajectories. Although numerous efforts have been made to address these limitations, more effective solutions are still needed. The recent development of artificial intelligence, particularly machine learning (ML), offers exciting opportunities to address the challenges of MD. In this review we aim to familiarize readers with the basics of MD while highlighting its limitations. The main focus is on exploring the integration of deep learning with MD simulations. The advancements made by ML are systematically outlined, including the development of ML-based force fields, techniques for improved conformational space sampling, and innovative methods for trajectory analysis. Additionally, the challenges and implications associated with the integration of ML and artificial intelligence are discussed. While the potential of ML-MD fusion is clearly established, further applications are needed to confirm its superiority over traditional methods. This comprehensive overview of the new perspectives of MD, which ML has opened up, serves as a gentle introduction to the exciting phase of MD development.
Machine learned interatomic potentials (MLIPs) are reshaping computational chemistry practices because of their ability to drastically exceed the accuracy-length/time scale tradeoff. Despite this attraction, the benefits of such efficiency are only impactful when an MLIP uniquely enables insight into a target system or is broadly transferable outside of the training dataset, where models achieving the latter are seldom reported. In this work, we present the 2nd generation of our atoms-in-molecules neural network potential (AIMNet2), which is applicable to species composed of up to 14 chemical elements in both neutral and charged states, making it a valuable model for modeling the majority of non-metallic compounds. Using an exhaustive dataset of 20 million hybrid quantum chemical calculations, AIMNet2 combines ML-parameterized short-range and physics-based long-range terms to attain generalizability that reaches from simple organics to diverse molecules with “exotic” element-organic bonding. We show that AIMNet2 outperforms semi-empirical GFN-xTB and is on par with reference density functional theory for interaction energy contributions, conformer search tasks, torsion rotation profiles, and molecular-to-macromolecular geometry optimization. Overall, the demonstrated chemical coverage and computational efficiency of AIMNet2 is a significant step toward providing access to MLIPs that avoid the crucial limitation of curating additional quantum chemical data and retraining with each new application.
Chemical Reviews, Journal Year: 2024, Volume and Issue: 124(24), P. 13681 - 13714
Published: Nov. 21, 2024
The field of data-driven chemistry is undergoing an evolution, driven by innovations in machine learning models for predicting molecular properties and behavior. Recent strides in ML-based interatomic potentials have paved the way for accurate modeling of diverse chemical and structural properties at the atomic level. The key determinant defining MLIP reliability remains the quality of the training data. A paramount challenge lies in constructing training sets that capture specific domains in the vast chemical and structural space. This Review navigates the intricate landscape of essential components and integrity of training data that ensure the extensibility and transferability of the resulting models. We delve into the details of active learning, discussing its various facets and implementations. We outline different types of uncertainty quantification applied to atomistic data acquisition and the correlations between estimated uncertainty and true error. The role of atomistic data samplers in generating diverse and informative structures is highlighted. Furthermore, we discuss data acquisition via modified and surrogate potential energy surfaces as an innovative approach to diversify training data. The Review also provides a list of publicly available data sets that cover essential domains of chemical space.
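The uncertainty-driven data acquisition mentioned in this abstract can be illustrated with a query-by-committee selection step. This is only a minimal sketch, not the Review's own procedure: the function name `select_by_committee`, the use of the committee standard deviation as the uncertainty estimate, and the threshold value are all assumptions made for illustration.

```python
import numpy as np

def select_by_committee(predictions, threshold):
    """Pick structures on which an ensemble of models disagrees.

    predictions: array-like of shape (n_models, n_structures) holding the
                 energy each committee member predicts for each structure.
    threshold:   hypothetical cutoff on the committee standard deviation.
    Returns (selected_indices, std), with the most uncertain structure first.
    """
    preds = np.asarray(predictions, dtype=float)
    # Sample standard deviation across committee members, per structure
    std = preds.std(axis=0, ddof=1)
    # Structures whose disagreement exceeds the cutoff are labeling candidates
    candidates = np.flatnonzero(std > threshold)
    # Sort candidates by decreasing uncertainty
    selected = candidates[np.argsort(-std[candidates])]
    return selected, std

# Three committee members, three candidate structures (made-up numbers)
committee = [[1.0, 2.0, 3.0],
             [1.0, 2.5, 1.0],
             [1.0, 1.5, 5.0]]
selected, std = select_by_committee(committee, threshold=0.4)
```

In an active learning loop, the selected structures would be sent for reference calculations and added to the training set before the committee is retrained.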
ACS Physical Chemistry Au, Journal Year: 2024, Volume and Issue: 4(3), P. 232 - 241
Published: March 21, 2024
In the next half-century, physical chemistry will likely undergo a profound transformation, driven predominantly by the combination of recent advances in quantum chemistry and machine learning (ML). Specifically, equivariant neural network potentials (NNPs) are a breakthrough new tool that is already enabling us to simulate systems at the molecular scale with unprecedented accuracy and speed, relying on nothing but fundamental physical laws. The continued development of this approach will realize Paul Dirac's 80-year-old vision of using quantum mechanics to unify physics with chemistry, providing invaluable tools for understanding materials science, biology, earth sciences, and beyond. The era of highly accurate and efficient first-principles molecular simulations will provide a wealth of training data that can be used to build automated computational methodologies, using tools such as diffusion models, for the design and optimization of systems at the molecular scale. Large language models (LLMs) will also evolve into increasingly indispensable tools for literature review, coding, idea generation, and scientific writing.
Journal of Chemical Theory and Computation, Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 10, 2025
While machine learning (ML) models have been able to achieve unprecedented accuracies across various prediction tasks in quantum chemistry, it is now apparent that accuracy on a test set alone is not a guarantee for robust chemical modeling such as stable molecular dynamics (MD). To go beyond accuracy, we use explainable artificial intelligence (XAI) techniques to develop a general analysis framework for atomic interactions and apply it to the SchNet and PaiNN neural network models. We compare these interactions with a set of fundamental chemical principles to understand how well the models have learned the underlying physicochemical concepts from the data. We focus on the strength of the interactions for different atomic species, how predictions for intensive and extensive quantum molecular properties are made, and analyze the decay and many-body nature of the interactions with interatomic distance. Models that deviate too far from known physical principles produce unstable MD trajectories, even when they have very high energy and force prediction accuracy. We also suggest further improvements to the ML architectures to better account for the polynomial decay of atomic interactions.
Journal of Chemical Theory and Computation, Journal Year: 2024, Volume and Issue: unknown
Published: Aug. 16, 2024
Machine learning potentials (MLPs) have revolutionized the field of atomistic simulations by describing atomic interactions with the accuracy of electronic structure methods at a small fraction of the cost. Most current MLPs construct the energy of a system as a sum of atomic energies, which depend on information about the atomic environments provided in the form of predefined or learnable feature vectors. If, in addition, nonlocal phenomena like long-range charge transfer are important, fourth-generation MLPs need to be used, which include a charge equilibration (Qeq) step to take the global structure of the system into account. This Qeq can significantly increase the computational cost and thus can become a computational bottleneck for large systems. In this Article, we present a highly efficient formulation of Qeq that does not require the explicit computation of the Coulomb matrix elements, resulting in a quasi-linear scaling method. Moreover, our approach also allows for the efficient calculation of energy derivatives, which explicitly consider the global structure-dependence of the atomic charges as obtained from Qeq. Due to its generality, the method is not restricted to MLPs and can also be applied within a variety of other force fields.
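For orientation, the conventional charge equilibration step that this abstract seeks to accelerate can be sketched as a dense linear solve. The sketch below is the explicit-matrix baseline with cubic scaling, not the quasi-linear formulation the Article presents. It assumes point charges with a bare 1/r interaction in atomic units (fourth-generation MLPs typically use smeared charges), and the function name `solve_qeq` and its parameters are hypothetical.

```python
import numpy as np

def solve_qeq(chi, hardness, positions, total_charge=0.0):
    """Dense reference Qeq: minimize
        E(q) = sum_i chi_i q_i + 1/2 sum_i J_i q_i^2 + 1/2 sum_{i!=j} q_i q_j / r_ij
    subject to sum_i q_i = total_charge, via a Lagrange multiplier.
    Assumes point charges in atomic units (an illustrative simplification).
    """
    chi = np.asarray(chi, dtype=float)
    hardness = np.asarray(hardness, dtype=float)
    positions = np.asarray(positions, dtype=float)
    n = len(chi)

    # Pairwise distances between all atoms
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Explicit Coulomb matrix: 1/r off the diagonal, hardness J_i on it
    A = np.zeros((n, n))
    off_diag = ~np.eye(n, dtype=bool)
    A[off_diag] = 1.0 / dist[off_diag]
    A[np.diag_indices(n)] = hardness

    # Augmented system: stationarity (A q + lambda = -chi) plus the charge constraint
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = A
    M[:n, n] = 1.0
    M[n, :n] = 1.0
    rhs = np.concatenate([-chi, [total_charge]])

    solution = np.linalg.solve(M, rhs)  # O(N^3) dense solve
    q = solution[:n]
    energy = chi @ q + 0.5 * q @ A @ q
    return q, energy

# Two atoms 2 bohr apart; the more electronegative atom takes negative charge
q, e_qeq = solve_qeq(chi=[-0.2, 0.2], hardness=[1.0, 1.0],
                     positions=[[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
```

Building and factorizing the full matrix is what makes this baseline costly for large systems, which is the bottleneck the abstract refers to.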
Journal of Chemical Information and Modeling, Journal Year: 2024, Volume and Issue: 64(3), P. 749 - 760
Published: Jan. 11, 2024
Accurately determining the global minima of a molecular structure is important in diverse scientific fields, including drug design, materials science, and chemical synthesis. Conformational search engines serve as valuable tools for exploring the extensive conformational space of molecules and for identifying energetically favorable conformations. In this study, we present a comparison of Auto3D, CREST, Balloon, and ETKDG (from RDKit), which are freely available conformational search engines, to evaluate their effectiveness in locating global minima. These engines employ distinct methodologies, including machine learning (ML) potential-based, semiempirical, and force field-based approaches. To validate these methods, we propose the use of collisional cross-section (CCS) values obtained from ion mobility–mass spectrometry studies. We hypothesize that experimental gas-phase CCS values can provide experimental evidence that we likely have the global minimum for a given molecule. To facilitate this effort, we used our gas-phase conformation library (GPCL), which currently consists of the full ensembles of 20 small molecules and can be used by the community to validate any conformational search engine. Further members of the GPCL can be readily created for any molecule of interest using our standard workflow used to compute CCS values, expanding the ability of the GPCL in validation exercises. These innovative validation techniques enhance our understanding of the conformational landscape and provide valuable insights into the performance of conformational generation engines. Our findings shed light on the strengths and limitations of each search engine, enabling informed decisions for their utilization in various fields, where accurate molecular structure determination is crucial for understanding biological activity and designing targeted interventions. By facilitating the identification of reliable conformations, this study significantly contributes to enhancing the efficiency and accuracy of molecular structure determination, with particular focus on metabolite structure elucidation. The findings of this research also provide valuable insights for developing effective workflows for predicting the structures of unknown compounds with high precision.
The Journal of Physical Chemistry A, Journal Year: 2024, Volume and Issue: 128(13), P. 2543 - 2555
Published: March 22, 2024
Activation energy characterization of competing reactions is a costly but crucial step for understanding the kinetic relevance of distinct reaction pathways, product yields, and myriad other properties of reacting systems. The standard methodology for activation energy characterization has historically been a transition state search using the highest level of theory that can be afforded. However, recently, several groups have popularized the idea of predicting activation energies directly based on nothing more than the reactant and product graphs, a sufficiently complex neural network, and a broad enough data set. Here, we have revisited this task using the recently developed Reaction Graph Depth 1 (RGD1) transition state data set and several newly developed graph attention architectures. All of these new architectures achieve similar state-of-the-art results of ∼4 kcal/mol mean absolute error on withheld testing sets of reactions but poor performance on external testing sets composed of reactions with differing mechanisms, molecularity, or reactant size distribution. Limited transferability is also shown to be shared by other contemporary graph to activation energy architectures through a series of case studies. We conclude that an array of standard graph architectures can already achieve results comparable to the irreducible error of available reaction data sets but that out-of-distribution performance remains poor.
Journal of Chemical Theory and Computation, Journal Year: 2024, Volume and Issue: 20(15), P. 6946 - 6956
Published: June 4, 2024
Accurate prediction of micro-pKa values is crucial for understanding and modulating the acidity and basicity of organic molecules, with applications in drug discovery, materials science, and environmental chemistry. This work introduces QupKake, a novel method that combines graph neural network models with semiempirical quantum mechanical (QM) features to achieve exceptional accuracy and generalization in micro-pKa prediction. QupKake outperforms state-of-the-art models on a variety of benchmark data sets, with root-mean-square errors between 0.5 and 0.8 pKa units on five external test sets. Feature importance analysis reveals the crucial role of QM features in both the reaction site enumeration and micro-pKa prediction models. QupKake represents a significant advancement in micro-pKa prediction, offering a powerful tool for various applications in chemistry and beyond.
The Journal of Physical Chemistry B, Journal Year: 2024, Volume and Issue: 128(15), P. 3662 - 3676
Published: April 3, 2024
The field of machine learning potentials has experienced a rapid surge in progress, thanks to advances in machine learning theory, algorithms, and hardware capabilities. While the underlying methods are continuously evolving, the infrastructure for their deployment has lagged. The community, due to these rapid developments, frequently finds itself split into groups built around different implementations of machine-learned potentials. In this work, we introduce