The Journal of Chemical Physics, Year: 2023, Volume (issue): 158(16)
Published: April 27, 2023
Deep learning has emerged as a promising paradigm to give access to highly accurate predictions of molecular and material properties. A common shortcoming shared by current approaches, however, is that neural networks only give point estimates of their predictions and do not come with predictive uncertainties associated with these estimates. Existing uncertainty quantification efforts have primarily leveraged the standard deviation of predictions across an ensemble of independently trained neural networks. This incurs a large computational overhead in both training and prediction, resulting in order-of-magnitude more expensive predictions. Here, we propose a method to estimate the predictive uncertainty based on a single neural network without the need for an ensemble. This allows us to obtain uncertainty estimates with virtually no additional computational overhead over standard training and inference. We demonstrate that the quality of the uncertainty estimates matches those obtained from deep ensembles. We further examine the uncertainty estimates of our methods and deep ensembles across the configuration space of our test system and compare the uncertainties to the potential energy surface. Finally, we study the efficacy of the method in an active learning setting and find the results to match an ensemble-based strategy at order-of-magnitude reduced computational cost.
Accounts of Chemical Research, Year: 2020, Volume (issue): 54(2), pp. 263-270
Published: Dec. 28, 2020
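Conspectus: Recent advances in computer hardware and software have led to a revolution in deep neural networks that has impacted fields ranging from language translation to computer vision. Deep learning has also impacted a number of areas in drug discovery, including the analysis of cellular images and the design of novel routes for the synthesis of organic molecules. While work in these areas has been impactful, a complete review of the applications of deep learning in drug discovery would be beyond the scope of a single Account. In this Account, we will focus on two key areas where deep learning has impacted molecular design: the prediction of molecular properties and the de novo generation of suggestions for new molecules.
One of the most significant advances in the development of quantitative structure–activity relationships (QSARs) has come from the application of deep learning methods to the prediction of the biological activity and physical properties of molecules in drug discovery programs. Rather than employing the expert-derived chemical features typically used to build predictive models, researchers are now using deep learning to develop novel molecular representations. These representations, coupled with the ability of deep neural networks to uncover complex, nonlinear relationships, have led to state-of-the-art performance. While deep learning has changed the way that many researchers approach QSARs, it is not a panacea. As with any other machine learning task, the design of predictive models is dependent on the quality, quantity, and relevance of available data. Seemingly fundamental issues, such as optimal methods for creating a training set, are still open questions for the field. Another critical area that is still the subject of multiple research efforts is the development of methods for assessing the confidence in a model.
Deep learning has also contributed to a renaissance in de novo molecule generation. Rather than relying on manually defined heuristics, deep learning methods learn to generate new molecules based on sets of existing molecules. Techniques that were originally developed for areas such as image generation and language translation have been adapted to the generation of molecules. These deep learning methods have been coupled with the predictive models described above and are being used to generate new molecules with specific predicted biological activity profiles. While these generative algorithms appear promising, there have been only a few reports on the synthesis and testing of molecules based on designs proposed by generative models. The evaluation of the diversity, quality, and ultimate value of molecules produced by generative models is still an open question. While the field has produced a number of benchmarks, it has yet to agree on how one should ultimately assess molecules "invented" by an algorithm.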
Wiley Interdisciplinary Reviews Computational Molecular Science, Year: 2022, Volume (issue): 12(5)
Published: March 5, 2022
Abstract: Development of new products often relies on the discovery of novel molecules. While conventional molecular design involves using human expertise to propose, synthesize, and test new molecules, this process can be cost and time intensive, limiting the number of molecules that can be reasonably tested. Generative modeling provides an alternative approach to molecular discovery by reformulating molecular design as an inverse design problem. Here, we review the recent advances in the state-of-the-art of generative molecular design and discuss the considerations for integrating these models into real molecular discovery campaigns. We first review the model design choices required to develop and train a generative model, including common 1D, 2D, and 3D representations of molecules and typical generative modeling neural network architectures. We then describe different problem statements for molecular discovery applications and explore the benchmarks used to evaluate models based on those problem statements. Finally, we discuss the important factors that play a role in integrating generative models into experimental workflows. Our aim is that this review will equip the reader with the information and context necessary to utilize generative modeling within their domain.
This article is categorized under: Data Science > Artificial Intelligence/Machine Learning
Chemical Reviews, Year: 2023, Volume (issue): 123(13), pp. 8736-8780
Published: June 29, 2023
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, while big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by data issues, such as data diversity, imputation, noise, imbalance, and high-dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), ...
Chemie Ingenieur Technik, Year: 2021, Volume (issue): 93(12), pp. 2029-2039
Published: Oct. 22, 2021
Abstract: The transformation of the chemical industry to renewable energy and feedstock supply requires new paradigms for the design of flexible plants, (bio-)catalysts, and functional materials. Recent breakthroughs in machine learning (ML) provide unique opportunities, but only joint interdisciplinary research between the ML and chemical engineering (CE) communities will unfold the full potential. We identify six challenges that will open new methods for CE and formulate new types of problems for ML: (1) optimal decision making, (2) introducing and enforcing physics in ML, (3) information and knowledge representation, (4) heterogeneity of data, (5) safety and trust in ML applications, and (6) creativity. Under the umbrella of these challenges, we discuss perspectives for future interdisciplinary research that will enable the transformation of CE.
ACS Central Science, Year: 2021, Volume (issue): 7(8), pp. 1356-1367
Published: July 27, 2021
While neural networks achieve state-of-the-art performance for many molecular modeling and structure-property prediction tasks, these models can struggle with generalization to out-of-domain examples, exhibit poor sample efficiency, and produce uncalibrated predictions. In this paper, we leverage advances in evidential deep learning to demonstrate a new approach to uncertainty quantification for neural network-based molecular structure-property prediction at no additional computational cost. We develop both evidential 2D message passing neural networks and evidential 3D atomistic neural networks and apply these networks across a range of different tasks. We demonstrate that evidential uncertainties enable (1) calibrated predictions where uncertainty correlates with error, (2) sample-efficient training through uncertainty-guided active learning, and (3) improved experimental validation rates in a retrospective virtual screening campaign. Our results suggest that evidential deep learning can provide an efficient means of uncertainty quantification useful for molecular property prediction, discovery, and design tasks in the chemical and physical sciences.
Polymer membranes perform innumerable separations with far-reaching environmental implications. Despite decades of research, design of new membrane materials remains a largely Edisonian process. To address this shortcoming, we demonstrate a generalizable, accurate machine learning (ML) implementation for the discovery of innovative polymers with ideal performance. Specifically, multitask ML models are trained on experimental data to link polymer chemistry to gas permeabilities of He, H2, O2, N2, CO2, and CH4. We interpret the ML models and extract valuable insights into the contributions of different chemical moieties to permeability and selectivity. We then screen over 9 million hypothetical polymers and identify thousands that lie well above current performance upper bounds, including hundreds of never-before-seen ultrapermeable polymer membranes with O2 and CO2 permeability greater than 10^4 and 10^5 Barrers, respectively. High-fidelity molecular dynamics simulations confirm the ML-predicted performance of the promising candidates, which suggests that many can be translated to reality.
Science, Year: 2022, Volume (issue): 378(6618), pp. 399-405
Published: Oct. 27, 2022
General conditions for organic reactions are important but rare, and efforts to identify them usually consider only narrow regions of chemical space. Discovering more general reaction conditions requires considering vast regions of chemical space derived from a large matrix of substrates crossed with a high-dimensional matrix of reaction conditions, rendering exhaustive experimentation impractical. Here, we report a simple closed-loop workflow that leverages data-guided matrix down-selection, uncertainty-minimizing machine learning, and robotic experimentation to discover general reaction conditions. Application to the challenging and consequential problem of heteroaryl Suzuki-Miyaura cross-coupling identified conditions that double the average yield relative to a widely used benchmark that was previously developed using traditional approaches. This study provides a practical road map for solving multidimensional chemical optimization problems with large search spaces.
Natural Resources Research, Year: 2022, Volume (issue): 31(3), pp. 1351-1373
Published: April 12, 2022
Abstract: Uncertainty quantification (UQ) is an important benchmark to assess the performance of artificial intelligence (AI) and particularly deep learning ensembled-based models. However, the ability for UQ using current AI-based methods is not only limited in terms of computational resources but it also requires changes to topology and optimization processes, as well as multiple performances to monitor model instabilities. From both geo-engineering and societal perspectives, a predictive groundwater table (GWT) model presents an important challenge, where a lack of UQ limits the validity of findings and may undermine science-based decisions. To overcome and address these limitations, a novel ensemble, an automated random deactivating connective weights approach (ARDCW), is presented and applied to retrieved geographical locations of GWT data from a geo-engineering project in Stockholm, Sweden. In this approach, the UQ was achieved via a combination of several derived ensembles from a fixed optimum topology subjected to randomly switched off weights, which allow predictability with one forward pass. The process was developed and programmed to provide trackable performance in a specific task and access to a wide variety of different internal characteristics and libraries. A comparison of performance with Monte Carlo dropout and quantile regression using computer vision and control task metrics showed significant progress in the ARDCW. This approach does not require changes in the optimization process and can be applied to already trained topologies in a way that outperforms other models.
Advanced Materials, Year: 2022, Volume (issue): 34(36)
Published: April 22, 2022
Abstract: Owing to the rapid developments to improve the accuracy and efficiency of both experimental and computational investigative methodologies, the massive amounts of data generated have led the field of materials science into the fourth paradigm of data-driven scientific research. This transition requires the development of authoritative and up-to-date frameworks for data-driven approaches for material innovation. A critical discussion on the current advances in the data-driven discovery of materials with a focus on frameworks, machine-learning algorithms, material-specific databases, descriptors, and targeted applications in the field of inorganic materials is presented. Frameworks for rationalizing data-driven material innovation are described, and a critical review of essential subdisciplines is presented, including: i) advanced data-intensive strategies and machine-learning algorithms; ii) material databases and related tools and platforms for data generation and management; iii) commonly used molecular descriptors used in data-driven processes. Furthermore, an in-depth discussion on the broad applications of material innovation, such as energy conversion and storage, environmental decontamination, flexible electronics, optoelectronics, superconductors, metallic glasses, and magnetic materials, is provided. Finally, how these subdisciplines (with insights into the synergy of materials science, tools, and mathematics) support data-driven paradigms is outlined, and the opportunities and challenges in data-driven material innovation are highlighted.