Machine Learning Science and Technology, Journal Year: 2023, Volume and Issue: 4(3), P. 035025
Published: Aug. 11, 2023
Abstract
Virtual screening can accelerate drug discovery by identifying promising candidates for experimental evaluation. Machine learning is a powerful method for screening, as it can learn complex structure–property relationships from experimental data and make rapid predictions over virtual libraries. Molecules inherently exist as a three-dimensional ensemble, and their biological action typically occurs through supramolecular recognition. However, most deep learning approaches to molecular property prediction use a 2D graph representation as input, and in some cases a single 3D conformation. Here we investigate how the 3D information of multiple conformers, traditionally known as 4D information in the cheminformatics community, can improve molecular property prediction in deep learning models. We introduce multiple deep learning models that expand upon key architectures such as ChemProp and SchNet, adding elements such as multiple-conformer inputs and conformer attention. We then benchmark the performance trade-offs of these models on 2D, 3D and 4D representations in the prediction of drug activity using a large training set of geometrically resolved molecules. The new architectures perform significantly better than 2D models, but their performance is often just as strong with a single conformer as with many. We also find that 4D deep learning models learn interpretable attention weights for each conformer.
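The idea of conformer attention mentioned in this abstract can be illustrated with a minimal sketch. It assumes each conformer has already been encoded as a fixed-length vector and uses plain dot-product attention against a query vector; the function names `softmax` and `attention_pool`, the query vector, and the toy embeddings are hypothetical and are not taken from the paper, whose actual architecture may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(conformer_embeddings, query):
    """Pool per-conformer vectors into one molecule-level vector.

    Hypothetical illustration: each conformer is scored by its dot product
    with `query`, the scores are normalised with a softmax, and the
    embeddings are averaged using those weights.
    """
    if not conformer_embeddings:
        raise ValueError("at least one conformer embedding is required")
    scores = [
        sum(q * x for q, x in zip(query, emb)) for emb in conformer_embeddings
    ]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * emb[i] for w, emb in zip(weights, conformer_embeddings))
        for i in range(dim)
    ]
    return pooled, weights


# Toy data: three conformers, each with a 2-dimensional embedding.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, -0.5]
pooled, weights = attention_pool(embeddings, query)
```

The returned `weights` sum to one and can be read as a per-conformer importance, which is the sense in which such weights are interpretable. In a trained model the query and the embeddings would be learned rather than fixed by hand.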
Chemical Society Reviews, Journal Year: 2021, Volume and Issue: 50(16), P. 9121 - 9151
Published: Jan. 1, 2021
COVID-19 has resulted in huge numbers of infections and deaths worldwide and brought the most severe disruptions to societies and economies since the Great Depression. A massive experimental and computational research effort to understand and characterize the disease and rapidly develop diagnostics, vaccines, and drugs has emerged in response to this devastating pandemic, and more than 130 000 COVID-19-related research papers have been published in peer-reviewed journals or deposited in preprint servers. Much of the research effort has focused on the discovery of novel drug candidates or repurposing of existing drugs against COVID-19, and many such projects have been either exclusively computational or computer-aided experimental studies. Herein, we provide an expert overview of the key computational methods and their applications for the discovery of COVID-19 small-molecule therapeutics that have been reported in the research literature. We further outline that, after the first year of the COVID-19 pandemic, it appears that drug repurposing has not produced rapid and global solutions. However, several known drugs have been used in the clinic to cure COVID-19 patients, and a few repurposed drugs continue to be considered in clinical trials, along with several novel clinical candidates. We posit that truly impactful computational tools must deliver actionable, experimentally testable hypotheses enabling the discovery of novel drugs and drug combinations, and that open science and rapid sharing of research results are critical to accelerate the development of novel, much needed therapeutics for COVID-19.
Cell Reports Medicine, Journal Year: 2022, Volume and Issue: 3(12), P. 100794
Published: Oct. 27, 2022
Recent advances and accomplishments of artificial intelligence (AI) and deep generative models have established their usefulness in medicinal applications, especially in drug discovery and development. To correctly apply AI, the developer and user face questions such as which protocols to consider, which factors to scrutinize, and how the deep generative models can integrate the relevant disciplines. This review summarizes classical and newly developed AI approaches, providing an updated and accessible guide to the broad computational drug discovery and development community. We introduce deep generative models from different standpoints and describe the theoretical frameworks for representing chemical and biological structures and their applications. We discuss the data and technical challenges and highlight future directions of multimodal deep generative models for accelerating drug discovery.
Molecular representation learning (MRL) has gained tremendous attention due to its critical role in learning from limited supervised data for applications like drug design. In most MRL methods, molecules are treated as 1D sequential tokens or 2D topology graphs, limiting their ability to incorporate 3D information for downstream tasks and, in particular, making it almost impossible to perform 3D geometry prediction/generation. In this paper, we propose a universal 3D MRL framework, called Uni-Mol, that significantly enlarges the representation ability and application scope of MRL schemes. Uni-Mol contains two pretrained models with the same SE(3) Transformer architecture: a molecular model pretrained by 209M molecular conformations; a pocket model pretrained by 3M candidate protein pocket data. Besides, Uni-Mol contains several finetuning strategies to apply the pretrained models to various downstream tasks. By properly incorporating 3D information, Uni-Mol outperforms SOTA in 14/15 molecular property prediction tasks. Moreover, Uni-Mol achieves superior performance in 3D spatial tasks, including protein-ligand binding pose prediction, molecular conformation generation, etc. The code, model, and data are made publicly available at https://github.com/dptech-corp/Uni-Mol.
The Journal of Chemical Physics, Journal Year: 2024, Volume and Issue: 160(11)
Published: March 21, 2024
Conformer–rotamer sampling tool (CREST) is an open-source program for the efficient and automated exploration of molecular chemical space. Originally developed in Pracht et al. [Phys. Chem. Chem. Phys. 22, 7169 (2020)] as an automated driver for calculations at the extended tight-binding level (xTB), it offers a variety of molecular- and metadynamics simulations, geometry optimization, and molecular structure analysis capabilities. Implemented algorithms include automated procedures for conformational sampling, explicit solvation studies, the calculation of absolute molecular entropy, and the identification of molecular protonation and deprotonation sites. Calculations are set up to run concurrently, providing efficient single-node parallelization. CREST is designed to require minimal user input and comes with an implementation of the GFNn-xTB Hamiltonians and the GFN-FF force-field. Furthermore, interfaces to any quantum chemistry and force-field software can easily be created. In this article, we present recent developments in the CREST code and show a selection of applications for the most important features of the program. An important novelty is the refactored calculation backend, which provides significant speed-up for sampling of small or medium-sized drug molecules and allows for more sophisticated setups, for example, quantum mechanics/molecular mechanics and minimum energy crossing point calculations.
Scientific Data, Journal Year: 2022, Volume and Issue: 9(1)
Published: June 7, 2022
Abstract
Machine learning approaches in drug discovery, as well as in other areas of the chemical sciences, benefit from curated datasets of physical molecular properties. However, there currently is a lack of data collections featuring large bioactive molecules alongside first-principle quantum chemical information. The open-access QMugs (Quantum-Mechanical Properties of Drug-like Molecules) dataset fills this void. The QMugs collection comprises quantum mechanical properties of more than 665 k biologically and pharmacologically relevant molecules extracted from the ChEMBL database, totaling ~2 M conformers. QMugs contains optimized molecular geometries and thermodynamic data obtained via the semi-empirical method GFN2-xTB. Atomic and molecular properties are provided on both the GFN2-xTB and the density-functional levels of theory (DFT, ωB97X-D/def2-SVP). QMugs features molecules of significantly larger size than previously-reported collections and comprises their respective quantum mechanical wave functions, including DFT density and orbital matrices. This dataset is intended to facilitate the development of models that learn from molecular data on different levels of theory while also providing insight into the corresponding relationships between molecular structure and biological activity.
Drug Discovery Today, Journal Year: 2023, Volume and Issue: 28(4), P. 103516
Published: Feb. 2, 2023
Over the past decade, the amount of biomedical data available has grown at unprecedented rates. Increased automation technology and larger data volumes have encouraged the use of machine learning (ML) or artificial intelligence (AI) techniques for mining such data and extracting useful patterns. Because the identification of chemical entities with desired biological activity is a crucial task in drug discovery, AI technologies have the potential to accelerate this process and support decision making. In addition, the advent of deep learning (DL) has shown great promise in addressing diverse problems in drug discovery, such as de novo molecular design. Herein, we will appraise the current state-of-the-art in AI-assisted drug discovery, discussing recent applications covering generative models for chemical structure generation, scoring functions to improve binding affinity and pose prediction, and molecular dynamics to assist in the parametrization, featurization and generalization tasks. Finally, we will discuss current hurdles and the strategies to overcome them, as well as potential future directions.