AI/ML methodologies and the future-will they be successful in designing the next generation of new chemical entities?
Journal of Cheminformatics,
Journal year:
2025,
Issue
17(1)
Published: April 6, 2025
Cheminformatics and chemical databases are essential to drug discovery. However, machine learning (ML) and artificial intelligence (AI) methodologies are changing the way in which chemical data is used. How will the use of chemical data change drug discovery moving forward? How do new ML methods in molecular property prediction, hit, lead, and target identification, and structure prediction differ from and compare with previous computational methods? Will ML methods improve diversity in ligand design and offer computational enhancements? There are still many advantages to physics-based methods, and they offer something lacking in ML/AI methods. Additionally, ML training methods often give the best results when experimental assay measurements are fed back into the model. Often, modeling and prediction methods are not diametrically opposed but offer the greatest advantage when used in a complementary manner.
Language: English
Evaluating Molecular Similarity Measures: Do Similarity Measures Reflect Electronic Structure Properties?
Journal of Chemical Information and Modeling,
Journal year:
2025,
Issue
unknown
Published: April 29, 2025
The rapid adoption of big data, machine learning (ML), and generative artificial intelligence (AI) in chemical discovery has heightened the importance of quantifying molecular similarity. Molecular similarity, commonly assessed as the distance between molecular fingerprints, is integral to applications such as database curation, diversity analysis, and property prediction. AI tools frequently rely on these similarity measures to cluster molecules under the assumption that structurally similar molecules exhibit similar properties. However, this assumption is not universally valid, particularly for continuous properties like electronic structure properties. Despite the prevalence of fingerprint-based similarity measures, their evaluation has largely depended on biological activity data sets and qualitative metrics, limiting their relevance to nonbiological domains. To address this gap, we propose a framework to evaluate the correlation between similarity measures and molecular properties. Our approach builds on the concept of neighborhood behavior and incorporates kernel density estimation (KDE) analysis to quantify how well similarity measures capture property relationships. Using a data set of over 350 million molecule pairs with electronic structure, redox, and optical properties, we systematically evaluate several fingerprint generators and distance functions against these properties. Both the curated data set and the evaluation framework are publicly available.
Language: English
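The abstract above describes molecular similarity as a distance between fingerprints. As a minimal sketch of that idea, the snippet below computes the Tanimoto (Jaccard) similarity between two fingerprints represented as sets of on-bit indices. The fingerprints are hypothetical toy values, not output from any fingerprint generator evaluated in the paper, and returning 1.0 for two empty fingerprints is a convention chosen here for illustration.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical toy fingerprints (sets of bit positions that are switched on).
fp_1 = {1, 4, 7, 9}
fp_2 = {1, 4, 8, 9}
fp_3 = {2, 3, 5}

sim_12 = tanimoto(fp_1, fp_2)  # 3 shared bits out of 5 distinct bits
sim_13 = tanimoto(fp_1, fp_3)  # no shared bits
dist_12 = 1.0 - sim_12         # the corresponding Tanimoto distance

print(sim_12, sim_13, dist_12)
```

A neighborhood-behavior analysis of the kind the abstract mentions would then compare such pairwise similarities with the differences in a measured property across many molecule pairs.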
Machine Learning Pipeline for Molecular Property Prediction Using ChemXploreML
Journal of Chemical Information and Modeling,
Journal year:
2025,
Issue
unknown
Published: May 20, 2025
We present ChemXploreML, a modular desktop application designed for machine learning-based molecular property prediction. The framework's flexible architecture allows integration of any molecular embedding technique with modern machine learning algorithms, enabling researchers to customize their prediction pipelines without extensive programming expertise. To demonstrate the framework's capabilities, we implement and evaluate two molecular embedding approaches, Mol2Vec and VICGAE (Variance-Invariance-Covariance regularized GRU Auto-Encoder), combined with state-of-the-art tree-based ensemble methods (Gradient Boosting Regression, XGBoost, CatBoost, and LightGBM). Using five fundamental molecular properties as test cases (melting point, boiling point, vapor pressure, critical temperature (CT), and critical pressure), we validate our framework on a data set from the CRC Handbook of Chemistry and Physics. The models achieve excellent performance for well-distributed properties, with R² values up to 0.93 for the CT predictions. Notably, while Mol2Vec embeddings (300 dimensions) delivered slightly higher accuracy, VICGAE embeddings (32 dimensions) exhibited comparable performance yet offered significantly improved computational efficiency. ChemXploreML's modular design facilitates easy integration of new embedding techniques and machine learning algorithms, providing a flexible platform for customized property prediction tasks. The application automates chemical data preprocessing (including UMAP-based exploration of molecular space), model optimization, and performance analysis through an intuitive interface, making sophisticated machine learning techniques accessible while maintaining extensibility for advanced cheminformatics users.
Language: English
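The abstract above reports model quality as R² (the coefficient of determination). For readers unfamiliar with the metric, the sketch below shows how it is computed from observed and predicted values. The numbers are hypothetical toy data, not values from the CRC Handbook data set, and this is not ChemXploreML's own code.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviations from the mean of the observations.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted property values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

score = r_squared(observed, predicted)
print(round(score, 3))
```

An R² of 1 means the predictions match the observations exactly, while a value of 0 means the model does no better than always predicting the mean.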
Interpretable Multimodal Deep Ensemble Framework Dissecting Blood-Brain Barrier Permeability with Molecular Features
The Journal of Physical Chemistry Letters,
Journal year:
2025,
Issue
unknown, pp. 5806-5819
Published: June 3, 2025
Blood-brain barrier permeability (BBBP) prediction plays a critical role in the drug discovery process, particularly for compounds targeting the central nervous system. While machine learning (ML) has significantly advanced the prediction of BBBP, there remains an urgent need for interpretable ML models that can reveal the physicochemical principles governing BBB permeability. In this study, we propose a multimodal framework that integrates molecular fingerprints (Morgan, MACCS, and RDK) and image features to improve BBBP prediction. The classification task (BBB-permeable vs nonpermeable) is addressed with a stacking ensemble model combining multiple base classifiers. The proposed framework demonstrates competitive predictive stability, generalization ability, and feature interpretability compared with recent approaches, under comparable evaluation settings. Beyond predictive performance, our framework incorporates Principal Component Analysis (PCA) and Shapley Additive Explanations (SHAP) analysis to highlight key fingerprint features contributing to the predictions. The regression task (logBB value prediction) is tackled by a multi-input deep learning framework, incorporating a Transformer encoder for fingerprint processing, a convolutional neural network (CNN) for image feature extraction, and a Multi-Head Attention fusion mechanism to enhance feature interactions. Attention maps are derived from token-level relationships within the molecular representations. This work provides a framework for BBBP modeling with enhanced transparency and mechanistic insight, and lays the foundation for future studies incorporating transparent descriptors and physics-informed features.
Language: English
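The abstract above distinguishes a regression target (logBB) from a binary permeable/nonpermeable label. As a hedged illustration of how the two relate, the sketch below computes logBB as the base-10 logarithm of the brain-to-blood concentration ratio and derives a class label from it. The cutoff of -1.0 is an assumption made for this example only; the paper's own labeling rule is not stated in the abstract, and the concentrations are hypothetical.

```python
import math


def log_bb(conc_brain: float, conc_blood: float) -> float:
    """logBB: base-10 log of the brain-to-blood concentration ratio."""
    if conc_brain <= 0 or conc_blood <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_brain / conc_blood)


def is_permeable(log_bb_value: float, threshold: float = -1.0) -> bool:
    """Binary label from logBB; the default threshold is an assumed cutoff."""
    return log_bb_value >= threshold


# Hypothetical concentrations in the same units for brain and blood.
high = log_bb(50.0, 100.0)  # ratio 0.5
low = log_bb(1.0, 100.0)    # ratio 0.01

print(high, is_permeable(high))
print(low, is_permeable(low))
```

Keeping the threshold as a parameter makes it easy to match whatever cutoff a given data set uses.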