Journal of the American Chemical Society,
Journal Year:
2024,
Volume and Issue:
146(29), P. 19654 - 19659
Published: July 11, 2024
We evaluate the effectiveness of pretrained and fine-tuned large language models (LLMs) for predicting the synthesizability of inorganic compounds and the selection of precursors needed to perform inorganic synthesis. The predictions of fine-tuned LLMs are comparable to, and sometimes better than, recent bespoke machine learning models for these tasks but require only minimal user expertise, cost, and time to develop. Therefore, this strategy can serve both as an effective and strong baseline for future machine learning studies of various chemical applications and as a practical tool for experimental chemists.
ACS Catalysis,
Journal Year:
2023,
Volume and Issue:
13(21), P. 13863 - 13895
Published: Oct. 13, 2023
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough validation of emerging models before their use for rational design. We present our opinions on the fundamental problems and outline the potential directions for future research.
Journal of Cheminformatics,
Journal Year:
2024,
Volume and Issue:
16(1)
Published: Feb. 21, 2024
REINVENT 4 is a modern open-source generative AI framework for the design of small molecules. The software utilizes recurrent neural networks and transformer architectures to drive molecule generation. These generators are seamlessly embedded within the general machine learning optimization algorithms: transfer learning, reinforcement learning and curriculum learning. REINVENT 4 enables and facilitates de novo design, R-group replacement, library design, linker design, scaffold hopping and molecule optimization. This contribution gives an overview of the software and describes its design. Algorithms and their applications are discussed in detail. REINVENT 4 is a command line tool which reads a user configuration in either TOML or JSON format. The aim of this release is to provide reference implementations for some of the most common algorithms in AI based molecule generation. An additional goal with the release is to create a framework for education and future innovation in AI based molecular design. The software is available from https://github.com/MolecularAI/REINVENT4 and released under the permissive Apache 2.0 license.
Scientific contribution. The software provides an open-source reference implementation for generative molecular design where the software is also being used in production to support in-house drug discovery projects. The publication of the most common machine learning algorithms in one code and full documentation thereof will increase transparency of AI and foster innovation, collaboration and education.
Journal of the American Chemical Society,
Journal Year:
2023,
Volume and Issue:
145(40), P. 21699 - 21716
Published: Sept. 27, 2023
Exceptional molecules and materials with one or more extraordinary properties are both technologically valuable and fundamentally interesting, because they often involve new physical phenomena or new compositions that defy expectations. Historically, exceptionality has been achieved through serendipity, but recently, machine learning (ML) and automated experimentation have been widely proposed to accelerate target identification and synthesis planning. In this Perspective, we argue that the data-driven methods commonly used today are well-suited for optimization but not for the realization of new exceptional materials or molecules. Finding such outliers should be possible using ML, but only by shifting away from using traditional ML approaches that tweak the composition, crystal structure, or reaction pathway. We highlight case studies of high-Tc oxide superconductors and superhard materials to demonstrate the challenges of ML-guided discovery and discuss the limitations of automation for this task. We then provide six recommendations for the development of ML methods capable of exceptional materials discovery: (i) Avoid the tyranny of the middle and focus on extrema; (ii) When data are limited, qualitative predictions that provide direction are more valuable than interpolative accuracy; (iii) Sample what can be made and how to make it and defer optimization; (iv) Create room (and look) for the unexpected while pursuing your goal; (v) Try to fill-in-the-blanks of the input and output space; (vi) Do not confuse human understanding with model interpretability. We conclude with a description of how these recommendations can be integrated into automated discovery workflows, which should enable the discovery of exceptional molecules and materials.
Nature Machine Intelligence,
Journal Year:
2024,
Volume and Issue:
6(4), P. 437 - 448
Published: March 29, 2024
Abstract
Generative machine learning models have attracted intense interest for their ability to sample novel molecules with desired chemical or biological properties. Among these, language models trained on SMILES (Simplified Molecular-Input Line-Entry System) representations have been subject to the most extensive experimental validation and have been widely adopted. However, these models have what is perceived to be a major limitation: some fraction of the SMILES strings that they generate are invalid, meaning that they cannot be decoded to a chemical structure. This perceived shortcoming has motivated a remarkably broad spectrum of work designed to mitigate the generation of invalid SMILES or correct them post hoc. Here I provide causal evidence that the ability to produce invalid outputs is not harmful but is instead beneficial to chemical language models. I show that the generation of invalid outputs provides a self-corrective mechanism that filters low-likelihood samples from the language model output. Conversely, enforcing valid outputs produces structural biases in the generated molecules, impairing distribution learning and limiting generalization to unseen chemical space. Together, these results refute the prevailing assumption that invalid SMILES are a shortcoming of chemical language models and reframe them as a feature, not a bug.
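To make the filtering step described above concrete, the sketch below discards generated strings that fail a crude syntactic screen. It is an illustration only and not the method of the paper: a real pipeline would decode each string with a cheminformatics toolkit, whereas the hypothetical function `looks_well_formed` checks only that branch parentheses are balanced, bracket atoms are closed, and ring-closure labels occur in pairs. It does not check valence or aromaticity.

```python
DIGITS = "0123456789"


def looks_well_formed(smiles):
    """Crude syntactic screen for a SMILES string (hypothetical helper).

    Checks balanced branches, closed bracket atoms and paired ring labels.
    This is a proxy for validity, not a full SMILES parser.
    """
    depth = 0            # current nesting depth of branch parentheses
    open_rings = set()   # ring-closure labels opened but not yet closed
    in_bracket = False   # True while inside a [...] atom specification
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if in_bracket:
            # Digits inside brackets are isotopes, charges or H counts.
            if ch == "]":
                in_bracket = False
            elif ch == "[":
                return False
        elif ch == "[":
            in_bracket = True
        elif ch == "]":
            return False
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        elif ch == "%":
            # Two-digit ring-closure label such as %12.
            label = smiles[i + 1:i + 3]
            if len(label) != 2 or any(c not in DIGITS for c in label):
                return False
            open_rings ^= {int(label)}
            i += 2
        elif ch in DIGITS:
            # First occurrence opens a ring bond, the second closes it.
            open_rings ^= {int(ch)}
        i += 1
    return bool(smiles) and depth == 0 and not in_bracket and not open_rings


# Hypothetical model outputs: three well-formed strings and three broken ones.
samples = ["C1CCCCC1", "C1CCCC", "CC(C", "c1ccccc1C(=O)O", "[NH4+]", "CC)C"]
kept = [s for s in samples if looks_well_formed(s)]
print(kept)
```

Under the abstract's argument, the strings removed by such a filter are disproportionately low-likelihood samples, so discarding them after generation is preferable to constraining the model so that it can only emit valid strings.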
Advanced Energy Materials,
Journal Year:
2024,
Volume and Issue:
unknown
Published: Dec. 10, 2024
Abstract
This review highlights recent advances in machine learning (ML)‐assisted design of energy materials. Initially, ML algorithms were successfully applied to screen materials databases by establishing complex relationships between atomic structures and their resulting properties, thus accelerating the identification of candidates with desirable properties. Recently, the development of highly accurate ML interatomic potentials and generative models has not only improved the robust prediction of physical properties, but also significantly accelerated the discovery of materials. In the past couple of years, ML methods have enabled high‐precision first‐principles predictions of electronic and optical properties for large systems, providing unprecedented opportunities in materials science. Furthermore, ML‐assisted microstructure reconstruction and physics‐informed solutions for partial differential equations have facilitated the understanding of microstructure–property relationships. Most recently, the seamless integration of various ML platforms has led to the emergence of autonomous laboratories that combine quantum mechanical calculations, large language models, and experimental validations, fundamentally transforming the traditional approach to novel materials synthesis. While highlighting the aforementioned recent advances, existing challenges are also discussed. Ultimately, ML is expected to fully integrate atomic‐scale simulations, reverse engineering, process optimization, and device fabrication, empowering system design. This will drive transformative innovations in energy conversion, storage, and harvesting technologies.