arXiv (Cornell University), Journal year: 2024, Issue: unknown
Published: Jan. 1, 2024
Large language models (LLMs) are a class of artificial intelligence models based on deep learning, which have great performance in various tasks, especially in natural language processing (NLP). Large language models typically consist of artificial neural networks with numerous parameters, trained on large amounts of unlabeled input using self-supervised or semi-supervised learning. However, their potential for solving bioinformatics problems may even exceed their proficiency in modeling human language. In this review, we will present a summary of the prominent large language models used in natural language processing, such as BERT and GPT, and focus on exploring the applications of large language models at different omics levels in bioinformatics, mainly including applications of large language models in genomics, transcriptomics, proteomics, drug discovery, and single cell analysis. Finally, this review summarizes the potential and prospects of large language models in solving bioinformatic problems.

Chemical Reviews, Journal year: 2023, Issue: 123(13), pp. 8736-8780
Published: June 29, 2023
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, while big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high-dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), ...

Journal of the American Chemical Society, Journal year: 2023, Issue: 145(16), pp. 8736-8750
Published: April 13, 2023
Traditional computational approaches to design chemical species are limited by the need to compute properties for a vast number of candidates, e.g., by discriminative modeling. Therefore, inverse design methods aim to start from the desired property and optimize a corresponding chemical structure. From a machine learning viewpoint, the inverse design problem can be addressed through so-called generative modeling. Mathematically, discriminative models are defined by learning the probability distribution function of properties given the molecular or material structure. In contrast, a generative model seeks to exploit the joint probability of a chemical species with target characteristics. The overarching idea of generative modeling is to implement a system that produces novel compounds that are expected to have a desired set of chemical features, effectively sidestepping issues found in the forward design process. In this contribution, we overview and critically analyze popular generative algorithms like generative adversarial networks, variational autoencoders, flow, and diffusion models. We highlight key differences between each of the models, provide insights into recent success stories, and discuss outstanding challenges for realizing generative modeling discovered solutions in chemical applications.
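The distinction drawn in this abstract between discriminative and generative modeling can be written compactly. The notation below is an assumption introduced only for illustration and is not taken from the paper: x stands for a molecular or material structure, y for a property, and y* for the desired property value. The block assumes the amsmath package.

```latex
\begin{align}
\text{discriminative (forward) model:}\quad & p(y \mid x) \\
\text{generative (joint) model:}\quad & p(x, y) = p(x \mid y)\, p(y) \\
\text{inverse design by sampling:}\quad & x^{\ast} \sim p(x \mid y = y^{\ast})
\end{align}
```

Read this way, the forward route must evaluate p(y | x) for every candidate x, whereas the generative route draws candidates that are already conditioned on the target property.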

IEEE Transactions on Knowledge and Data Engineering, Journal year: 2024, Issue: 36(7), pp. 2814-2830
Published: Feb. 2, 2024
Deep generative models have unlocked another profound realm of human creativity. By capturing and generalizing patterns within data, we have entered the epoch of all-encompassing Artificial Intelligence for General Creativity (AIGC). Notably, diffusion models, recognized as one of the paramount generative models, materialize human ideation into tangible instances across diverse domains, encompassing imagery, text, speech, biology, and healthcare. To provide advanced and comprehensive insights into diffusion, this survey comprehensively elucidates its developmental trajectory and future directions from three distinct angles: the fundamental formulation of diffusion, algorithmic enhancements, and the manifold applications of diffusion. Each layer is meticulously explored to offer a profound comprehension of its evolution. Structured and summarized approaches are presented here.

2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Journal year: 2023, Issue: unknown, pp. 702-712
Published: Jan. 1, 2023
Unpaired image-to-image translation has broad applications in art, design, and scientific simulations. One early breakthrough was CycleGAN, which emphasizes one-to-one mappings between two unpaired image domains via generative-adversarial networks (GAN) coupled with the cycle-consistency constraint, while more recent works promote one-to-many mapping to boost the diversity of translated images. Motivated by scientific simulation and one-to-one needs, this work revisits the classic CycleGAN framework and boosts its performance to outperform more contemporary models without relaxing the cycle-consistency constraint. To achieve this, we equip the generator with a Vision Transformer (ViT) and employ necessary training and regularization techniques. Compared to previous best-performing models, our model performs better and retains a strong correlation between the original and translated image. An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement. To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained model are available at https://github.com/LS4GAN/uvcgan.
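The cycle-consistency constraint mentioned in this abstract can be illustrated with a minimal sketch. It is not the UVCGAN implementation: the names `l1_distance`, `cycle_consistency_loss`, `g_ab`, and `g_ba` are hypothetical, images are modeled as flat lists of floats, and the generators are toy functions standing in for neural networks. The loss penalizes the round trips A to B to A and B to A to B for failing to return the original image.

```python
from typing import Callable, Sequence

Image = Sequence[float]  # a flattened toy "image" (assumption for this sketch)


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same size")
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(
    g_ab: Callable[[Image], Image],
    g_ba: Callable[[Image], Image],
    batch_a: Sequence[Image],
    batch_b: Sequence[Image],
) -> float:
    """L1 cycle loss: mean |g_ba(g_ab(x)) - x| plus mean |g_ab(g_ba(y)) - y|."""
    forward = sum(l1_distance(g_ba(g_ab(x)), x) for x in batch_a) / len(batch_a)
    backward = sum(l1_distance(g_ab(g_ba(y)), y) for y in batch_b) / len(batch_b)
    return forward + backward


# Toy generators: doubling and halving are exact inverses of each other.
def double(img: Image) -> list:
    return [2.0 * v for v in img]


def halve(img: Image) -> list:
    return [v / 2.0 for v in img]


def halve_with_bias(img: Image) -> list:
    # An imperfect inverse: the constant offset breaks the cycle.
    return [v / 2.0 + 0.1 for v in img]


batch_a = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
batch_b = [[0.0, 2.0, 4.0], [6.0, 8.0, 10.0]]

perfect = cycle_consistency_loss(double, halve, batch_a, batch_b)
imperfect = cycle_consistency_loss(double, halve_with_bias, batch_a, batch_b)
```

With the exactly inverse pair the loss is zero, and with the biased inverse it is positive, which is the signal a one-to-one translation model would be trained to minimize alongside its adversarial loss.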

Patterns, Journal year: 2023, Issue: 4(2), pp. 100678-100678
Published: Feb. 1, 2023
Molecular discovery is a multi-objective optimization problem that requires identifying a molecule or set of molecules that balance multiple, often competing, properties. Multi-objective molecular design is commonly addressed by combining properties of interest into a single objective function using scalarization, which imposes assumptions about relative importance and uncovers little about the trade-offs between objectives. In contrast to scalarization, Pareto optimization does not require knowledge of relative importance and reveals the trade-offs between objectives. However, it introduces additional considerations in algorithm design. In this review, we describe pool-based and de novo generative approaches to multi-objective molecular discovery with a focus on Pareto optimization algorithms. We show how pool-based molecular discovery is a relatively direct extension of multi-objective Bayesian optimization and how the plethora of different generative models extend from single-objective to multi-objective optimization in similar ways using non-dominated sorting in the reward function (reinforcement learning) or to select molecules for retraining (distribution learning) or propagation (genetic algorithms). Finally, we discuss some remaining challenges and opportunities in the field, emphasizing the opportunity to adopt Bayesian optimization techniques into multi-objective de novo design.
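The contrast between scalarization and Pareto optimization described in this abstract can be sketched in a few lines. This is an illustrative toy, not code from the review: the function names, the two objectives (both maximized), and the candidate scores are hypothetical. Non-dominated sorting groups candidates into successive Pareto fronts without any weights, while scalarization picks a single winner that depends on the chosen weights.

```python
from typing import List, Sequence, Tuple

Scores = Tuple[float, ...]  # one value per objective, all to be maximized


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(points: Sequence[Scores]) -> List[List[int]]:
    """Return indices grouped into fronts; front 0 is the Pareto front."""
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i
            for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def best_by_scalarization(points: Sequence[Scores], weights: Sequence[float]) -> int:
    """Index of the candidate with the highest weighted sum of objectives."""
    return max(
        range(len(points)),
        key=lambda i: sum(w * v for w, v in zip(weights, points[i])),
    )


# Hypothetical candidates scored on two competing properties.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]

fronts = non_dominated_sort(candidates)
equal_weight_pick = best_by_scalarization(candidates, (0.5, 0.5))
skewed_weight_pick = best_by_scalarization(candidates, (0.9, 0.1))
```

Here the first front contains three mutually non-dominated candidates, exposing the trade-off directly, whereas the scalarized pick changes from one candidate to another when the weights change. In the generative settings the abstract lists, the front index could serve as a reward signal or as a selection criterion.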

International Journal of Molecular Sciences, Journal year: 2022, Issue: 23(21), pp. 13568-13568
Published: Nov. 5, 2022
Traditional drug design requires a great amount of research time and developmental expense. Booming computational approaches, including computational biology, computer-aided drug design, and artificial intelligence, have the potential to expedite drug discovery by minimizing the time and financial cost. In recent years, computational approaches have been widely used to improve the efficacy and effectiveness of the drug discovery pipeline, leading to the approval of plenty of new drugs for marketing. The present review emphasizes applications of these indispensable computational approaches in aiding target identification, lead discovery, and lead optimization. Some challenges of using these approaches for drug design are also discussed. Moreover, we propose a methodology for integrating various computational techniques into new drug discovery and design.

Journal of Cheminformatics, Journal year: 2024, Issue: 16(1)
Published: Feb. 21, 2024
REINVENT 4 is a modern open-source generative AI framework for the design of small molecules. The software utilizes recurrent neural networks and transformer architectures to drive molecule generation. These generators are seamlessly embedded within the general machine learning optimization algorithms, transfer learning, reinforcement learning, and curriculum learning. REINVENT 4 enables and facilitates de novo design, R-group replacement, library design, linker design, scaffold hopping, and molecule optimization. This contribution gives an overview of the software and describes its design. Algorithms and their applications are discussed in detail. REINVENT 4 is a command line tool which reads a user configuration in either TOML or JSON format. The aim of this release is to provide reference implementations for some of the most common algorithms in AI based molecule generation. An additional goal with the release is to create a framework for education and future innovation in AI based molecular design. The software is available from https://github.com/MolecularAI/REINVENT4 and released under the permissive Apache 2.0 license. Scientific contribution. The software provides an open-source reference implementation for generative molecular design where the software is also being used in production to support in-house drug discovery projects. The publication of the most common machine learning algorithms in one code and full documentation thereof will increase transparency of AI and foster innovation, collaboration, and education.