Genomic language models: opportunities and challenges
Gonzalo Benegas, Chengzhong Ye, Carlos Albors, et al.
Trends in Genetics, Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 1, 2025
Language: English
Evaluating the representational power of pre-trained DNA language models for regulatory genomics
Ziqi Tang, Nikunj V. Somia, Yiyang Yu, et al.
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: March 4, 2024
ABSTRACT
The emergence of genomic language models (gLMs) offers an unsupervised approach to learning a wide diversity of cis-regulatory patterns in the non-coding genome without requiring labels of functional activity generated by wet-lab experiments. Previous evaluations have shown that pre-trained gLMs can be leveraged to improve predictive performance across a broad range of regulatory genomics tasks, albeit using relatively simple benchmark datasets and baseline models. Since the gLMs in these studies were tested upon fine-tuning their weights for each downstream task, determining whether gLM representations embody a foundational understanding of cis-regulatory biology remains an open question. Here we evaluate the representational power of pre-trained gLMs to predict and interpret cell-type-specific functional genomics data that span DNA and RNA regulation. Our findings suggest that probing the representations of pre-trained gLMs does not offer substantial advantages over conventional machine learning approaches that use one-hot encoded sequences. This work highlights a major gap with current gLMs, raising potential issues in conventional pre-training strategies for the non-coding genome.
Language: English
Large language model applications in nucleic acid research
Published: Jan. 1, 2025
Language: English
Identification, characterization, and design of plant genome sequences using deep learning
Zhenye Wang, Hao Yuan, Jianbing Yan, et al.
The Plant Journal, Journal Year: 2024, Volume and Issue: unknown
Published: Dec. 12, 2024
SUMMARY
Due to its excellent performance in processing large amounts of data and capturing complex non-linear relationships, deep learning has been widely applied in many fields of plant biology. Here we first review the application of deep learning in analyzing genome sequences to predict gene expression, chromatin interactions, and epigenetic features (open chromatin, transcription factor binding sites, and methylation sites) in plants. Then, current motif mining and functional component design and synthesis based on generative adversarial networks, large models, and attention mechanisms are elaborated in detail. The progress of protein structure and function prediction, genomic prediction, and large model applications based on deep learning is also discussed. Finally, this work provides prospects for the future development of deep learning in plants with regard to multiple omics data, algorithm optimization, large language models, sequence design, and intelligent breeding.
Language: English
The maize recombination landscape evolved during domestication.
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Nov. 6, 2024
Abstract
Meiotic recombination is an important evolutionary process because it can increase the amount of genetic variation within populations through the breakage of unfavorable linkages and the creation of novel allelic combinations. Despite a plethora of knowledge about the population-level benefits of recombination and numerous theoretical studies examining how recombination rates evolve over time, there is a lack of empirical evidence for any hypotheses that have been put forward. To alleviate this gap in knowledge, we characterized the evolution of the recombination landscape in Zea mays ssp. mays (maize) during its domestication from Zea mays ssp. parviglumis (teosinte), explored how the evolution of recombination permitted maize domestication, and tied these alterations to changes in the genetic basis of recombination. Using an experimental population genomics approach and ancestral recombination graph (ARG) inference, our data demonstrated that maize had a 12% genome-wide increase in recombination rate during domestication. Although the maize and teosinte recombination landscapes are highly correlated, r = 0.85 at 1Mb resolution, maize has evolved higher recombining regions in interstitial chromosome regions, compared to teosinte, which only harbors high recombining regions sub-telomerically. Our data show that the re-patterning of COs towards interstitial regions came from reduced CO interference levels in maize. Supporting the idea of reduced CO interference in maize, we found selection acting on trans-acting recombination-modifiers that participate in the class I CO pathway or in CO interference directly. Lastly, we showed that the evolution of the recombination landscape was beneficial during domestication, because significantly increased recombination rates were targeted to gene-rich regions harboring domestication related loci. Because regions with significant increases in recombination had lower deleterious mutation load, compared to regions with decreases in recombination, we concluded that domestication-related loci that were acted upon by selection during domestication, were shielded from the Hill-Robertson effect. In conclusion, recombination events allowed maize to adapt faster than previously understood.
Language: English
Genomic resources, opportunities, and prospects for accelerated improvement of millets
Theoretical and Applied Genetics, Journal Year: 2024, Volume and Issue: 137(12)
Published: Nov. 20, 2024
Genomic resources, alongside the tools and expertise required to leverage them, are essential for the effective improvement of globally significant millet crop species. Millets are essential for global food security and nutrition, particularly in sub-Saharan Africa and South Asia. They are crucial for promoting nutrition, climate resilience, economic development, and cultural heritage. Despite their critical role, millets have historically received less investment in developing genomic resources than major cereals like wheat, maize, and rice. However, recent advancements in genomics, particularly next-generation sequencing technologies, offer unprecedented opportunities for the rapid improvement of millet crops. This review paper provides an overview of the status of genomic resources in millets and of harnessing artificial intelligence to address challenges and boost productivity and end quality. It emphasizes the significance of genomics in tackling global food security issues and underscores the necessity for innovative breeding strategies to translate genomics and AI into effective breeding strategies for millets.
Language: English