Automatic Prediction of Band Gaps of Inorganic Materials Using a Gradient Boosted and Statistical Feature Selection Workflow
Journal of Chemical Information and Modeling,
Journal Year:
2024,
Volume and Issue:
64(4), P. 1187 - 1200
Published: Feb. 6, 2024
Machine learning (ML) methods can train a model to predict material properties by exploiting patterns in materials databases that arise from structure-property relationships. However, the importance of ML-based feature analysis and selection is often neglected when creating such models. Such analysis and selection are especially important when dealing with multifidelity data because they afford a complex feature space. This work shows how a gradient-boosted statistical feature-selection workflow can be used to train predictive models that classify materials by their metallicity and predict their band gap against experimental measurements, as well as computational data that are derived from electronic-structure calculations. These models are fine-tuned via Bayesian optimization, using solely the features that are derived from the chemical compositions of the materials data. We test these models against experimental data, computational data, and a combination of the two, and find that the multifidelity modeling option can reduce the number of features required to train a model. The performance of our workflow is benchmarked against state-of-the-art algorithms, the results of which demonstrate that our approach is either comparable or superior to them. The classification model realized an accuracy score of 0.943, a macro-averaged F1-score of 0.940, an area under the curve of the receiver operating characteristic of 0.985, and an average precision of 0.977, while the regression model achieved a mean absolute error of 0.246, a root-mean squared error of 0.402,
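The scores quoted in this abstract (accuracy, macro-averaged F1, mean absolute error, root-mean-squared error) follow standard definitions. As a minimal sketch of those definitions only, the snippet below computes them in plain Python. The function names and the toy labels and values are illustrative assumptions; they are not taken from the paper and do not reproduce its results.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root-mean-squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical toy data (assumption, for illustration only).
labels_true = ["metal", "metal", "nonmetal", "nonmetal", "nonmetal"]
labels_pred = ["metal", "nonmetal", "nonmetal", "nonmetal", "nonmetal"]
gaps_true = [1.0, 2.0, 3.0]
gaps_pred = [1.5, 2.0, 2.0]

print("accuracy:", accuracy(labels_true, labels_pred))
print("macro F1:", macro_f1(labels_true, labels_pred))
print("MAE:", mae(gaps_true, gaps_pred))
print("RMSE:", rmse(gaps_true, gaps_pred))
```

Macro averaging weights every class equally, so a minority class influences the score as much as a majority class does.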
Language: English
Automatic Prediction of Peak Optical Absorption Wavelengths in Molecules Using Convolutional Neural Networks
Journal of Chemical Information and Modeling,
Journal Year:
2024,
Volume and Issue:
64(5), P. 1486 - 1501
Published: Feb. 29, 2024
Molecular design depends heavily on optical properties for applications such as solar cells and polymer-based batteries. Accurate prediction of these properties is essential, and multiple predictive methods exist, from ab initio to data-driven techniques. Although theoretical methods, such as time-dependent density functional theory (TD-DFT) calculations, have well-established physical relevance and are among the most popular methods in computational physics and chemistry, they exhibit errors that are inherent in their approximate nature. These high-throughput electronic structure calculations also incur a substantial computational cost. With the emergence of big-data initiatives, cost-effective, data-driven methods have gained traction, although their usability is highly contingent on the degree of data quality and sparsity. In this study, we present a workflow that employs deep residual convolutional neural networks (DR-CNN) and gradient boosting feature selection to predict peak optical absorption wavelengths (λmax) exclusively from SMILES representations of dye molecules and solvents; one would normally measure λmax using UV–vis absorption spectroscopy. We use a multifidelity modeling approach, integrating 34,893 DFT calculations and 26,395 experimentally derived λmax data, to deliver more accurate predictions via a Bayesian-optimized gradient boosting machine. Our approach is benchmarked against the state of the art that is reported in the scientific literature; results demonstrate that learnt representations via a DR-CNN workflow that is integrated with other machine learning methods can accelerate the design of molecules for specific optical characteristics.
Language: English
Introduction to Machine Learning for Predictive Modeling I
Zhaoyang Chen,
Na Li,
Xiao Li
et al.
Challenges and advances in computational chemistry and physics,
Journal Year:
2025,
Volume and Issue:
unknown, P. 3 - 30
Published: Jan. 1, 2025
Language: English
Spatial-Temporal Dynamics in Country-Level Sustainable Energy Performance Using Ensemble Learning and Analytic Hierarchy Process
Amirreza Salehi,
Mahdi Alimohammadi,
Majid Khedmati
et al.
Journal of Cleaner Production,
Journal Year:
2025,
Volume and Issue:
unknown, P. 145497 - 145497
Published: April 1, 2025
Language: English
Machine-Learning Prediction of Curie Temperature from Chemical Compositions of Ferromagnetic Materials
Journal of Chemical Information and Modeling,
Journal Year:
2024,
Volume and Issue:
64(16), P. 6388 - 6409
Published: Aug. 7, 2024
Room-temperature ferromagnets are high-value targets for discovery given the ease by which they could be embedded within magnetic devices. However, the multitude of potential interactions among magnetic ions and their surrounding environments renders the prediction of thermally stable magnetic properties challenging. Therefore, it is vital to explore methods that can effectively screen potential candidates to expedite the discovery of novel ferromagnetic materials within highly intricate feature spaces. To this end, we explore machine-learning (ML) methods as a means to predict the Curie temperature (Tc) of ferromagnetic materials by discerning patterns within materials databases. This study emphasizes the importance of feature analysis and selection in ML modeling and demonstrates the efficacy of our gradient-boosted statistical feature-selection workflow for training predictive models. The models are fine-tuned through Bayesian optimization, using features that are derived solely from the chemical compositions of the materials data, before the model predictions are evaluated against literature values. We have collated ca. 35,000 Tc values of ferromagnetic materials, and the performance of our workflow is benchmarked against state-of-the-art algorithms, the results of which demonstrate that our methodology is superior to the majority of alternative methods. In a 10-fold cross-validation, the regression model realized an R² of (0.92 ± 0.01), an MAE of (40.8 ± 1.9) K, and an RMSE of (80.0 ± 5.0) K. We demonstrate the utility of our ML model through case studies that forecast the Tc of rare-earth intermetallic compounds and generate magnetic phase diagrams of various chemical systems. These case studies highlight the importance of a systematic approach for enhancing both the predictive capability and the interpretability of ML models, while being devoid of human bias. They demonstrate the advantages of such an approach over a mere reliance on algorithmic complexity and a black-box treatment in ML-based modeling within the domain of computational materials science.
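The abstract reports its regression scores from 10-fold cross-validation as a value ± a spread. As a minimal sketch of how such a figure can be produced, the snippet below splits a data set into k folds, fits a model on k-1 folds, scores it on the held-out fold, and summarizes the per-fold MAE. Everything here is an assumption for illustration: the one-variable least-squares line stands in for the authors' gradient-boosted model, the data are synthetic, and reading "±" as the standard deviation across folds is a guess, since the listing does not say how the spread was computed.

```python
import random
import statistics

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def cross_validated_mae(xs, ys, k=10, seed=0):
    """Return (mean, standard deviation) of the MAE over k held-out folds."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        errors = [abs(ys[i] - (slope * xs[i] + intercept)) for i in test_idx]
        scores.append(statistics.fmean(errors))
    return statistics.fmean(scores), statistics.stdev(scores)

# Synthetic data (assumption): a noisy linear trend, not real Tc values.
rng = random.Random(1)
xs = [float(i) for i in range(100)]
ys = [3.0 * x + 10.0 + rng.gauss(0.0, 5.0) for x in xs]

mean_mae, std_mae = cross_validated_mae(xs, ys, k=10)
print(f"MAE = {mean_mae:.2f} ± {std_mae:.2f}")
```

Because every sample is held out exactly once, the mean score uses all of the data for testing without ever scoring a model on samples it was trained on.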
Language: English
Machine-Learning Predictions of Critical Temperatures from Chemical Compositions of Superconductors
Journal of Chemical Information and Modeling,
Journal Year:
2024,
Volume and Issue:
unknown
Published: Sept. 17, 2024
In the quest for advanced superconducting materials, accurate prediction of critical temperatures (
Language: English
Predictive Modeling of High-Entropy Alloys and Amorphous Metallic Alloys Using Machine Learning
Journal of Chemical Information and Modeling,
Journal Year:
2024,
Volume and Issue:
unknown
Published: Oct. 1, 2024
High entropy alloys and amorphous metallic alloys represent two distinct classes of advanced alloy materials, each with unique structural characteristics. Their emergence has garnered considerable interest across the materials science and engineering communities, driven by their promising properties, including exceptional strength. However, their extensive compositional diversity poses substantial challenges for systematic exploration, as traditional experimental approaches and high-throughput calculations struggle to efficiently navigate this vast space. While the recent development in data-driven materials discovery could potentially help, such efforts are hindered by the scarcity of comprehensive data and the lack of robust predictive tools that can effectively link alloy composition with specific properties. To address these challenges, we have deployed a machine-learning-based workflow for feature selection and statistical analysis to afford predictive models that accelerate the data-driven discovery and optimization of these advanced materials. Our methodology is validated through two case studies: (i) a regression of the bulk modulus, and (ii) a classification based on glass-forming ability. The Bayesian-optimized regression model trained for the prediction of bulk modulus achieved an
Language: English
Machine Learning Approaches for Predicting Power Conversion Efficiency in Organic Solar Cells: A Comprehensive Review
Yang Jiang,
Chuang Yao,
Yezi Yang
et al.
Solar RRL,
Journal Year:
2024,
Volume and Issue:
unknown
Published: Oct. 9, 2024
Organic solar cells (OSCs), renowned for their lightweight, cost-efficient, and adaptable nature, stand out as a promising option for developing renewable energy. Improving the power conversion efficiency (PCE) of OSCs is essential, and researchers are delving into novel materials to achieve this. Traditional approaches are often laborious and costly, highlighting the need for predictive modeling. Machine learning (ML), especially via quantitative structure–property relationship (QSPR) models, is streamlining material development, with the goal to exceed 20% PCE. In this review, the application of ML in OSCs is explored, and recent studies utilizing ML for PCE prediction are reviewed, encompassing empirical functions, ML algorithms, self‐devised ML frameworks, and the combination of ML with automated experimental technologies. First, the benefits of ML in predicting PCE are addressed. Second, the development of high‐efficiency prediction models for both fullerene and nonfullerene acceptors is delved into. The impact of various ML algorithms on PCE prediction is then assessed, taking into account the construction of ML models. Moreover, the quality of databases and the selection of descriptors are considered. Databases based on […] are further categorized. Finally, prospects for the future are proposed.
Language: English
Automatic Prediction of Molecular Properties Using Substructure Vector Embeddings within a Feature Selection Workflow
Journal of Chemical Information and Modeling,
Journal Year:
2024,
Volume and Issue:
unknown
Published: Dec. 23, 2024
Machine learning (ML) methods provide a pathway to accurately predict molecular properties, leveraging patterns derived from structure–property relationships within materials databases. This approach holds significant importance in drug discovery and materials design, where the rapid, efficient screening of molecules can accelerate the development of new pharmaceuticals and chemical materials for highly specialized target application. Unsupervised and self-supervised learning methods applied to graph-based or geometric models have garnered considerable traction. More recently, transformer-based language models have emerged as powerful tools. Nevertheless, their application entails considerable computational resources, owing to the need for an extensive pretraining process on a vast corpus of unlabeled molecular data sets. To this end, we present a semisupervised strategy that harnesses substructure vector embeddings in conjunction with an ML-based feature selection workflow to predict various molecular properties. We evaluate the efficacy of our modeling methodology across a diverse range of data sets, encompassing both regression and classification tasks. Our findings demonstrate superior performance compared to most existing state-of-the-art algorithms, while offering advantages in terms of balancing model accuracy and computational requirements. Moreover, our approach provides deeper insights into feature interactions that are essential for model interpretability. A case study is conducted to predict the lipophilicity of molecules, exemplifying the robustness of our strategy. The result underscores the importance of meticulous feature analysis and selection over a mere reliance on predictive modeling with a high degree of algorithmic complexity.
Language: English
Negative Poisson's ratio of sulfides dominated by strong intralayer electron repulsion
Physical Chemistry Chemical Physics,
Journal Year:
2024,
Volume and Issue:
26(31), P. 20852 - 20863
Published: Jan. 1, 2024
Geometrical variations in a particular structure or other mechanical factors are often cited as the cause of negative Poisson's ratio (NPR).
Language: English