Journal of the American Chemical Society, Year: 2024, Vol. 146(29), pp. 19654-19659
Published: July 11, 2024
We evaluate the effectiveness of pretrained and fine-tuned large language models (LLMs) for predicting the synthesizability of inorganic compounds and the selection of precursors needed to perform inorganic synthesis. The predictions of fine-tuned LLMs are comparable to, and sometimes better than, recent bespoke machine learning models for these tasks but require only minimal user expertise, cost, and time to develop. Therefore, this strategy can serve both as an effective and strong baseline for future machine learning studies of various chemical applications and as a practical tool for experimental chemists.
Nature, Year: 2023, Vol. 616(7958), pp. 673-685
Published: April 26, 2023
Computer-aided drug discovery has been around for decades, although the past few years have seen a tectonic shift towards embracing computational technologies in both academia and pharma. This shift is largely defined by the flood of data on ligand properties and binding to therapeutic targets and their 3D structures, abundant computing capacities, and the advent of on-demand virtual libraries of drug-like small molecules in their billions. Taking full advantage of these resources requires fast computational methods for effective ligand screening. This includes structure-based virtual screening of gigascale chemical spaces, further facilitated by fast iterative screening approaches. Highly synergistic are developments in deep learning predictions of ligand properties and target activities in lieu of receptor structure. Here we review recent advances in ligand discovery technologies, their potential for reshaping the whole process of drug discovery and development, as well as the challenges they encounter. We also discuss how the rapid identification of highly diverse, potent, target-selective, and drug-like ligands to protein targets can democratize the drug discovery process, presenting new opportunities for the cost-effective development of safer and more effective small-molecule treatments.
Recent computational approaches and their application to streamlining drug discovery are discussed.
Accounts of Chemical Research, Year: 2022, Vol. 55(17), pp. 2454-2466
Published: Aug. 10, 2022
We must accelerate the pace at which we make technological advancements to address climate change and disease risks worldwide. This swifter pace of discovery requires faster research and development cycles enabled by better integration between hypothesis generation, design, experimentation, and data analysis. Typical research cycles take months to years. However, data-driven automated laboratories, or self-driving laboratories, can significantly accelerate molecular and materials discovery. Recently, substantial advancements have been made in the areas of machine learning and optimization algorithms that have allowed researchers to extract valuable knowledge from multidimensional data sets. Machine learning models can be trained on large data sets from the literature or databases, but their performance can often be hampered by a lack of negative results or metadata. In contrast, data generated by self-driving laboratories can be information-rich, containing precise details of the experimental conditions and metadata. Consequently, much larger amounts of high-quality data are gathered in self-driving laboratories. When placed in open repositories, this data can be used by the research community to reproduce experiments, for more in-depth analysis, or as the basis for further investigation. Accordingly, high-quality open data sets will increase the accessibility and reproducibility of science, which is sorely needed.
In this Account, we describe our efforts to build a self-driving lab for the development of a new class of materials: organic semiconductor lasers (OSLs). Since they have only recently been demonstrated, little is known about the molecular and material design rules for thin-film, electrically-pumped OSL devices as compared to other technologies such as organic light-emitting diodes or organic photovoltaics. To realize high-performing OSL materials, we are developing a flexible system for automated synthesis via iterative Suzuki-Miyaura cross-coupling reactions. This automated synthesis platform is directly coupled to the analysis and purification capabilities. Subsequently, the molecules of interest can be transferred to an optical characterization setup. We are currently limited to optical measurements of the OSL molecules in solution. However, material properties are ultimately most important in the solid state (e.g., as a thin-film device). To that end and for a different scientific goal, we are developing a self-driving lab for inorganic thin-film materials focused on the oxygen evolution reaction.
While the future of self-driving laboratories is very promising, numerous challenges still need to be overcome. These challenges can be split into cognition and motor function. Generally, the cognitive challenges are related to optimization with constraints or unexpected outcomes for which general algorithmic solutions have yet to be developed. A more practical challenge that could be resolved in the near future is that of software control and integration, because few instrument manufacturers design their products with self-driving laboratories in mind. Challenges in motor function are largely related to handling heterogeneous systems, such as dispensing solids or performing extractions. As a result, it is critical to understand that adapting experimental procedures that were designed for human experimenters is not as simple as transferring those same actions to an automated system, and there may be more efficient ways to achieve the same goal in an automated fashion. Accordingly, for self-driving laboratories, we need to carefully rethink the translation of manual experimental protocols.
Science, Year: 2022, Vol. 378(6618), pp. 399-405
Published: Oct. 27, 2022
General conditions for organic reactions are important but rare, and efforts to identify them usually consider only narrow regions of chemical space. Discovering more general reaction conditions requires considering vast regions of chemical space derived from a large matrix of substrates crossed with a high-dimensional matrix of reaction conditions, rendering exhaustive experimentation impractical. Here, we report a simple closed-loop workflow that leverages data-guided matrix down-selection, uncertainty-minimizing machine learning, and robotic experimentation to discover general reaction conditions. Application to the challenging and consequential problem of heteroaryl Suzuki-Miyaura cross-coupling identified conditions that double the average yield relative to a widely used benchmark that was previously developed using traditional approaches. This study provides a practical road map for solving multidimensional chemical optimization problems with large search spaces.
Journal of the American Chemical Society, Year: 2022, Vol. 144(43), pp. 19999-20007
Published: Oct. 19, 2022
We report the development of an open-source experimental design via Bayesian optimization platform for multi-objective reaction optimization. Using high-throughput experimentation (HTE) and virtual screening data sets containing high-dimensional continuous and discrete variables, we optimized the performance of the platform by fine-tuning the algorithm components such as reaction encodings, surrogate model parameters, and initialization techniques. Having established the framework, we applied the optimizer to real-world test scenarios for the simultaneous optimization of the reaction yield and enantioselectivity in a Ni/photoredox-catalyzed enantioselective cross-electrophile coupling of styrene oxide with two different aryl iodide substrates. Starting with no previous experimental data, the Bayesian optimizer identified reaction conditions that surpassed the previous human-driven optimization campaigns within 15 and 24 experiments, one count for each substrate, among 1728 possible configurations available in each optimization. To make the platform more accessible to nonexperts, we developed a graphical user interface (GUI) that can be accessed online through a web-based application and incorporated features such as condition modification on the fly and data visualization. This web application does not require software installation, removing any programming barrier to use the platform, which enables chemists to integrate Bayesian optimization routines into their everyday laboratory practices.
Chemical Science, Year: 2023, Vol. 14(19), pp. 4997-5005
Published: Jan. 1, 2023
The lack of publicly available, large, and unbiased datasets is a key bottleneck for the application of machine learning (ML) methods in synthetic chemistry. Data from electronic laboratory notebooks (ELNs) could provide less biased, large datasets, but no such datasets have been made publicly available. The first real-world dataset from the ELNs of a large pharmaceutical company is disclosed and its relationship to high-throughput experimentation (HTE) datasets is described. For chemical yield predictions, a key task in chemical synthesis, an attributed graph neural network (AGNN) performs as well as or better than the best previous models on two HTE datasets for the Suzuki-Miyaura and Buchwald-Hartwig reactions. However, training the AGNN on an ELN dataset does not lead to a predictive model. The implications of using ELN data for training ML-based models are discussed in the context of yield predictions.
Chemical Reviews, Year: 2024, Vol. 124(16), pp. 9633-9732
Published: Aug. 13, 2024
Self-driving laboratories (SDLs) promise an accelerated application of the scientific method. Through the automation of experimental workflows, along with autonomous experimental planning, SDLs hold the potential to greatly accelerate research in chemistry and materials discovery. This review provides an in-depth analysis of the state-of-the-art in SDL technology, its applications across various scientific disciplines, and the potential implications for research and industry. This review additionally provides an overview of the enabling technologies for SDLs, including their hardware, software, and integration with laboratory infrastructure. Most importantly, this review explores the diverse range of scientific domains where SDLs have made significant contributions, from drug discovery and materials science to genomics and chemistry. We provide a comprehensive review of existing real-world examples of SDLs, their different levels of automation, and the challenges and limitations associated with each domain.
Journal of the American Chemical Society, Year: 2023, Vol. 145(40), pp. 21699-21716
Published: Sept. 27, 2023
Exceptional molecules and materials with one or more extraordinary properties are both technologically valuable and fundamentally interesting, because they often involve new physical phenomena or new compositions that defy expectations. Historically, exceptionality has been achieved through serendipity, but recently, machine learning (ML) and automated experimentation have been widely proposed to accelerate target identification and synthesis planning. In this Perspective, we argue that the data-driven methods commonly used today are well-suited for optimization but not for the realization of new exceptional materials or molecules. Finding such outliers should be possible using ML, but only by shifting away from using traditional ML approaches that tweak the composition, crystal structure, or reaction pathway. We highlight case studies of high-Tc oxide superconductors and superhard materials to demonstrate the challenges of ML-guided discovery and discuss the limitations of automation for this task. We then provide six recommendations for the development of ML methods capable of exceptional materials discovery: (i) Avoid the tyranny of the middle and focus on extrema; (ii) When data are limited, qualitative predictions that provide direction are more valuable than interpolative accuracy; (iii) Sample what can be made and how to make it and defer optimization; (iv) Create room (and look) for the unexpected while pursuing your goal; (v) Try to fill-in-the-blanks of the input and output space; (vi) Do not confuse human understanding with model interpretability. We conclude with a description of how these recommendations can be integrated into automated discovery workflows, which should enable the discovery of exceptional molecules and materials.
Communications Chemistry, Year: 2024, Vol. 7(1)
Published: Jan. 12, 2024
The empirical aspect of descriptor design in catalyst informatics, particularly when confronted with limited data, necessitates adequate prior knowledge for delving into unknown territories, thus presenting a logical contradiction. This study introduces a technique for automatic feature engineering (AFE) that works on small catalyst datasets, without reliance on specific assumptions or pre-existing knowledge about the target catalysis when designing descriptors and building machine-learning models. This technique generates numerous features through mathematical operations on general physicochemical features of catalytic components and extracts relevant features for the desired catalysis, essentially screening numerous hypotheses on a machine. AFE yields reasonable regression results for three types of heterogeneous catalysis: oxidative coupling of methane (OCM), conversion of ethanol to butadiene, and three-way catalysis, where only the training set is swapped. Moreover, through the application of active learning that combines AFE and high-throughput experimentation for OCM, we successfully visualize the machine's process of acquiring precise recognition of the catalyst design. Thus, AFE is a versatile technique for data-driven catalysis research and a key step towards fully automated catalyst discoveries.
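The generate-then-select idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the primary feature names and values, the set of mathematical operations, the toy target, and the use of a simple Pearson-correlation ranking in place of whatever selection procedure the paper uses are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical operation set; a real study would choose its own.
UNARY_OPS = {
    "id": lambda v: v,
    "sq": lambda v: v * v,
    "sqrt": lambda v: math.sqrt(abs(v)),
    "log": lambda v: math.log(abs(v) + 1.0),
}


def generate_features(primary):
    """Expand primary features (name -> per-sample values) into many candidates."""
    generated = {}
    for name, column in primary.items():
        for op_name, op in UNARY_OPS.items():
            generated[f"{op_name}({name})"] = [op(v) for v in column]
    # Pairwise products of the primary features.
    for a, b in combinations(sorted(primary), 2):
        generated[f"{a}*{b}"] = [x * y for x, y in zip(primary[a], primary[b])]
    return generated


def select_features(features, target, k=3):
    """Keep the k candidates most correlated (in absolute value) with the target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Toy, made-up data: five "catalysts" described by two primary features.
primary = {
    "electronegativity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ionic_radius": [2.0, 1.0, 4.0, 3.0, 5.0],
}
# Toy target constructed to depend on the square of one primary feature.
target = [v * v for v in primary["electronegativity"]]

features = generate_features(primary)
top = select_features(features, target, k=3)
print(top[0])
```

With this toy target, the squared feature ranks first, which mirrors the abstract's point that the relevant descriptor is found by screening many generated hypotheses instead of being designed by hand.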