Inorganic Chemistry, Journal year: 2024, Issue: 63(43), pp. 20521-20530
Published: Oct. 15, 2024
The periodic number (PN) representation of the chemical systems, introduced by Dmitri Mendeleev, uncovers the fundamental principle of chemical similarity in a straightforward way. In this framework, the rows correspond to the principal quantum numbers of the elements' electronic configurations when considered in isolated atoms. This systematic arrangement allows for a deeper understanding of the relationships and patterns among the elements. In this study, we propose a novel strategy for structure type (prototype) prediction utilizing the PN concept to identify possible modifications and phase stability of unexplored systems. Our PN-based crystal structure prediction (PNcsp) program, which evaluates chemical similarity through the neighboring in the PN map, provides the most probable prototypes for unknown or unreported systems for given phases in binary and higher order systems. We applied PNcsp to 59 distinct systems whose equimolar phases are indicated in the respective phase diagrams but lack accurate experimental structure determination. Our methodology identified 93 prototypes for these equiatomic phases, of which 47 exhibit mechanical and dynamic stability. Notably, our approach discovered 19 entirely novel, fully stable polymorphic phases, thereby expanding the known landscape of potential materials. Furthermore, we demonstrated that our method is also effective for nonequimolar systems.
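The neighbor-based idea described in this abstract can be illustrated with a minimal sketch. This is not the PNcsp implementation: the element labels (`E1` to `E5`), their PN values, the prototype names, and the `radius` parameter are all hypothetical placeholders chosen only to show how prototypes might be collected from systems that sit close to a target system on a PN map.

```python
# Toy PN assignments for placeholder elements (hypothetical, not real PN values).
PN = {"E1": 1, "E2": 2, "E3": 3, "E4": 4, "E5": 5}

# Toy table of "known" binary systems and their prototypes (hypothetical).
KNOWN = {
    ("E1", "E4"): ["proto-A"],
    ("E2", "E5"): ["proto-B"],
    ("E3", "E4"): ["proto-A", "proto-C"],
    ("E1", "E1"): ["proto-D"],
}


def neighbor_prototypes(system, pn, known, radius=1):
    """Collect prototypes of known systems within `radius` PN steps of `system`.

    A known system (x, y) counts as a neighbor of (a, b) when both
    |PN(x) - PN(a)| and |PN(y) - PN(b)| are at most `radius`.
    """
    a, b = system
    found = set()
    for (x, y), prototypes in known.items():
        if (x, y) == (a, b):
            continue  # skip the target system itself
        if abs(pn[x] - pn[a]) <= radius and abs(pn[y] - pn[b]) <= radius:
            found.update(prototypes)
    return sorted(found)


candidates = neighbor_prototypes(("E2", "E4"), PN, KNOWN)
```

In this toy setting, the candidates for the unexplored system `("E2", "E4")` come from its three nearest neighbors on the map. A real workflow would then rank such candidates and test their stability, as the abstract describes.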

Chemical Reviews, Journal year: 2022, Issue: 122(15), pp. 13006-13042
Published: June 27, 2022
Artificial intelligence and specifically machine learning applications are nowadays used in a variety of scientific applications and cutting-edge technologies, where they have a transformative impact. Such an assembly of statistical and linear algebra methods making use of large data sets is becoming more and more integrated into chemistry and crystallization research workflows. This review aims to present, for the first time, a holistic overview of machine learning and cheminformatics applications as a novel, powerful means to accelerate the discovery of new crystal structures, predict key properties of organic crystalline materials, simulate, understand, and control the dynamics of complex crystallization process systems, as well as contribute to high throughput automation of chemical process development involving crystalline materials. We critically review the advances in these new, rapidly emerging research areas, raising awareness of issues such as the bridging of machine learning models with first-principles mechanistic models, data set size, structure, and quality, and the selection of appropriate descriptors. At the same time, we propose future research at the interface of applied mathematics, chemistry, and crystallography. Overall, this review aims to increase the adoption of such methods and tools by chemists and scientists across industry and academia.

Nature Communications, Journal year: 2024, Issue: 15(1)
Published: Dec. 6, 2024
The generation of plausible crystal structures is often the first step in predicting the structure and properties of a material from its chemical composition. However, most current methods for crystal structure prediction are computationally expensive, slowing the pace of innovation. Seeding structure prediction algorithms with quality generated candidates can overcome a major bottleneck. Here, we introduce CrystaLLM, a methodology for the versatile generation of crystal structures, based on the autoregressive large language modeling (LLM) of the Crystallographic Information File (CIF) format. Trained on millions of CIF files, CrystaLLM focuses on modeling crystal structures through text. CrystaLLM can produce plausible crystal structures for a wide range of inorganic compounds unseen in training, as demonstrated by ab initio simulations. Our approach challenges conventional representations of crystals, and demonstrates the potential of LLMs for learning effective models of crystal chemistry, which will lead to accelerated discovery and innovation in materials science.
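To make "modeling crystal structures through text" concrete, the sketch below serializes a structure into a minimal CIF-style string of the kind a language model could be trained on. It is only an illustration: the helper `to_cif` is hypothetical, the exact CIF layout and tokenization used by CrystaLLM are not given here, and the CsCl cell length is an approximate value used purely as an example.

```python
def to_cif(name, a, b, c, alpha, beta, gamma, sites):
    """Build a minimal P1 CIF text block from cell parameters and fractional sites.

    `sites` is a list of (element_symbol, x, y, z) tuples in fractional coordinates.
    """
    lines = [
        f"data_{name}",
        f"_cell_length_a {a:.4f}",
        f"_cell_length_b {b:.4f}",
        f"_cell_length_c {c:.4f}",
        f"_cell_angle_alpha {alpha:.4f}",
        f"_cell_angle_beta {beta:.4f}",
        f"_cell_angle_gamma {gamma:.4f}",
        "_symmetry_space_group_name_H-M 'P 1'",
        "loop_",
        "_atom_site_label",
        "_atom_site_type_symbol",
        "_atom_site_fract_x",
        "_atom_site_fract_y",
        "_atom_site_fract_z",
    ]
    # One row per atom; labels are the symbol plus a running index.
    for index, (symbol, x, y, z) in enumerate(sites, start=1):
        lines.append(f"{symbol}{index} {symbol} {x:.4f} {y:.4f} {z:.4f}")
    return "\n".join(lines) + "\n"


# Illustrative input: a cubic CsCl-type cell with an approximate lattice parameter.
cif_text = to_cif(
    "CsCl", 4.12, 4.12, 4.12, 90, 90, 90,
    [("Cs", 0.0, 0.0, 0.0), ("Cl", 0.5, 0.5, 0.5)],
)
```

Because the whole structure becomes a plain character sequence, an autoregressive model can in principle learn to continue a prompt such as `data_CsCl` with cell parameters and atomic sites, one token at a time.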

Advanced Science, Journal year: 2024, Issue: unknown
Published: Aug. 5, 2024
Self-supervised neural language models have recently achieved unprecedented success from natural language processing to learning the languages of biological sequences and organic molecules. These models have demonstrated superior performance in the generation, structure classification, and functional predictions for proteins and molecules with learned representations. However, most of the masking-based pre-trained language models are not designed for generative design, and their black-box nature makes it difficult to interpret their design logic. Here a Blank-filling Language Model for Materials (BLMM) Crystal Transformer is proposed, a neural network-based probabilistic generative model for generative and tinkering design of inorganic materials. The model is built on the blank-filling language model for text generation and has demonstrated unique advantages in learning the "materials grammars" together with high-quality generation, interpretability, and data efficiency. It can generate chemically valid materials compositions with as high as 89.7% charge neutrality and 84.8% balanced electronegativity, which are more than four and eight times higher compared to a pseudo-random sampling baseline. The probabilistic generation process of BLMM allows it to recommend materials tinkering operations based on learned materials chemistry, which makes it useful for materials doping. The model is applied to discover a set of new materials as validated using Density Functional Theory (DFT) calculations. This work thus brings the unsupervised transformer language models based generative artificial intelligence to inorganic materials. A user-friendly web app for tinkering materials design has been developed and can be accessed freely at www.materialsatlas.org/blmtinker.

Oxidation states (OS) are the charges on atoms due to electrons gained or lost upon applying an ionic approximation to their bonds. As a fundamental property, OS has been widely used in charge-neutrality verification, crystal structure determination, and reaction estimation. Currently, only heuristic rules exist for guessing the oxidation states of a given compound, with many exceptions. Recent work has developed machine learning models based on heuristic structural features for predicting the oxidation states of metal ions. However, composition-based oxidation state prediction still remains elusive so far, which has significant implications for the discovery of new materials for which the structures have not been determined. This work proposes a novel deep learning-based BERT transformer language model, BERTOS, for predicting the oxidation states of all elements of inorganic compounds given only their chemical composition. This model achieves 96.82% accuracy for all-element oxidation state prediction benchmarked on the cleaned ICSD dataset and achieves 97.61% accuracy for oxide materials. It is also demonstrated how it can be used to conduct large-scale screening of hypothetical material compositions for materials discovery.
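The charge-neutrality verification mentioned above can be sketched as a brute-force search over candidate oxidation states. This is a simplified illustration, not the BERTOS model: the `COMMON_OS` table is a small hand-picked subset, and the sketch assumes every atom of an element carries the same oxidation state, so mixed-valence compounds are not handled.

```python
from itertools import product

# Illustrative subset of common oxidation states; not a complete table.
COMMON_OS = {
    "Na": [1],
    "Cl": [-1],
    "O": [-2],
    "Sr": [2],
    "Ti": [2, 3, 4],
    "Fe": [2, 3],
}


def neutral_assignments(composition):
    """Return every one-state-per-element assignment whose total charge is zero.

    `composition` maps element symbols to integer atom counts,
    e.g. {"Sr": 1, "Ti": 1, "O": 3} for SrTiO3.
    """
    elements = sorted(composition)
    results = []
    # Try every combination of one candidate state per element.
    for states in product(*(COMMON_OS[el] for el in elements)):
        total = sum(state * composition[el] for el, state in zip(elements, states))
        if total == 0:
            results.append(dict(zip(elements, states)))
    return results


srtio3 = neutral_assignments({"Sr": 1, "Ti": 1, "O": 3})
```

For SrTiO3 the only neutral assignment under these assumptions is Sr(+2), Ti(+4), O(-2). A composition such as Fe3O4 returns no assignment, because its neutrality relies on iron in two different oxidation states. This is one reason rule-based checks have exceptions.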

2D Materials, Journal year: 2024, Issue: 11(3), pp. 032002-032002
Published: May 2, 2024
Deep learning (DL) methodologies have led to significant advancements in various domains, facilitating intricate data analysis and enhancing predictive accuracy and data generation quality through complex algorithms. In materials science, the extensive computational demands associated with high-throughput screening techniques such as density functional theory, coupled with limitations in laboratory production, present substantial challenges for material research. DL techniques are poised to alleviate these challenges by reducing the computational costs of simulating material properties and by generating novel materials with desired attributes. This comprehensive review document explores the current state of DL applications in materials design, with a particular emphasis on two-dimensional materials. The article encompasses an in-depth exploration of data-driven approaches in both forward and inverse design within the realm of materials science.

Machine Learning Science and Technology, Journal year: 2022, Issue: 4(1), pp. 015001-015001
Published: Dec. 21, 2022
Pre-trained transformer language models (LMs) on large unlabeled corpus have produced state-of-the-art results in natural language processing, organic molecule design, and protein sequence generation. However, no such models have been applied to learn the composition patterns for the generative design of material compositions. Here we train a series of seven modern transformer models (GPT, GPT-2, GPT-Neo, GPT-J, BLMM, BART, and RoBERTa) for materials design using the expanded formulas of the ICSD, OQMD, and Materials Projects databases. Six different datasets with/out non-charge-neutral or EB samples are used to benchmark the generative design performances and uncover the biases of modern transformer models for the generative design of materials compositions. Our experiments show that the materials transformers based on causal LMs can generate chemically valid material compositions with as high as 97.61% to be charge neutral and 91.22% to be electronegativity balanced, which has more than six times higher enrichment compared to the baseline pseudo-random sampling algorithm. Our LMs also demonstrate high generation novelty, and their potential in new materials discovery is proved by their capability to recover the leave-out materials. We also find that the properties of the generated compositions can be tailored by training the models with selected training sets such as high-bandgap samples. Our experiments also show that different models each have their own preference in terms of the properties of the generated samples, and their running time complexity varies a lot. We have applied our materials transformers to discover a set of new materials as validated using density functional theory calculations. All our trained materials transformer models and code can be accessed freely at http://www.github.com/usccolumbia/MTransformer.
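The "expanded formulas" used as training text can be pictured as sequences in which each element symbol is repeated according to its count. The sketch below shows one plausible way to produce such a sequence. The exact token format used by the authors is an assumption here, the helper `expand_formula` is hypothetical, and it handles only simple formulas with integer counts and no parentheses.

```python
import re


def expand_formula(formula):
    """Expand a simple formula into one token per atom, e.g. SrTiO3 -> Sr Ti O O O."""
    tokens = []
    # Each match is an element symbol followed by an optional integer count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        tokens.extend([symbol] * (int(count) if count else 1))
    return tokens


sequence = " ".join(expand_formula("SrTiO3"))
```

Written this way, a composition becomes a short "sentence" of element tokens, which is the kind of input a causal language model can learn to continue.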