Research Square (Research Square),
Journal Year: 2023, Volume and Issue: unknown
Published: Jan. 4, 2023
Abstract
The domestication process in Lima bean (Phaseolus lunatus L.) involves at least two independent events, within the Mesoamerican and Andean gene pools. Both processes produced similar phenotypic changes in landraces, making Lima bean an excellent model to understand convergent evolution. Despite recent research efforts, the mechanisms of adaptation followed by landraces are largely unknown. The genes related to these adaptations can be selected by identification of selective sweeps. Most previous genetic analyses have relied on Single Nucleotide Polymorphism (SNP) loci and ignored transposable elements (TEs), which are a major source of variation in plant genomes. The current availability of high-throughput sequencing technologies enables the collection of whole-genome resequencing (WGS) data to approach the intraspecies population dynamics of TEs. The present research collected WGS data from 60 wild and domesticated accessions to generate the most complete characterization developed to date of SNP loci and TEs in the Lima bean genome. We generated an updated annotation of 223,780 TEs. Furthermore, we identified genes and variable TEs affected by selective sweeps. Combining three different approaches, selective sweeps were predicted to set candidate genes. A small percentage of the genes under selection (1.6%) were shared among gene pools, suggesting that different genetic avenues of adaptation were followed by both gene pools. Up to 25% of the genes with selective sweeps previously reported in common bean were also detected in Lima bean. We built a catalog of 39,459 TEs with presence-absence variation (PAV). The fact that 75% of the TEs are located close to genes shows their potential to affect gene functions. The population structure inferred with TEs was consistent with that obtained with SNP markers, suggesting that TE dynamics are related to the demographic history of the Lima bean and its adaptive processes, in particular during domestication.
Nature Genetics,
Journal Year: 2024, Volume and Issue: 56(4), P. 721 - 731
Published: April 1, 2024
Abstract
Coffea arabica, an allotetraploid hybrid of Coffea eugenioides and Coffea canephora, is the source of approximately 60% of coffee products worldwide, and its cultivated accessions have undergone several population bottlenecks. We present chromosome-level assemblies of a di-haploid C. arabica accession and modern representatives of its diploid progenitors, C. eugenioides and C. canephora. The three species exhibit largely conserved genome structures between diploid parents and descendant subgenomes, with no obvious global subgenome dominance. We find evidence for a founding polyploidy event 350,000–610,000 years ago, followed by several pre-domestication bottlenecks, resulting in narrow genetic variation. A split between wild accessions and cultivar progenitors occurred ~30.5 thousand years ago, followed by a period of migration between the two populations. Analysis of modern varieties, including lines historically introgressed with C. canephora, highlights their breeding histories and loci that may contribute to pathogen resistance, laying the groundwork for future genomics-based breeding of C. arabica.
Trends in Genetics,
Journal Year: 2024, Volume and Issue: 40(10), P. 891 - 908
Published: Aug. 7, 2024
Harnessing cutting-edge technologies to enhance crop productivity is a pivotal goal in modern plant breeding. Artificial intelligence (AI) is renowned for its prowess in big data analysis and pattern recognition, and is revolutionizing numerous scientific domains including plant breeding. We explore the wider potential of AI tools in various facets of breeding, including data collection, unlocking genetic diversity within genebanks, and bridging the genotype–phenotype gap to facilitate crop breeding. This will enable the development of crop cultivars tailored to the projected future environments. Moreover, AI tools also hold promise for refining crop traits by improving the precision of gene-editing systems and predicting the potential effects of gene variants on plant phenotypes. Leveraging AI-enabled precision breeding can augment the efficiency of breeding programs and holds promise for optimizing cropping systems at the grassroots level. This entails identifying optimal inter-cropping and crop-rotation models to enhance agricultural sustainability and productivity in the field.
Briefings in Bioinformatics,
Journal Year: 2024, Volume and Issue: 25(3)
Published: March 27, 2024
Following the milestone success of the Human Genome Project, the 'Encyclopedia of DNA Elements (ENCODE)' initiative was launched in 2003 to unearth information about the numerous functional elements within the genome. This endeavor coincided with the emergence of numerous novel technologies, accompanied by the provision of vast amounts of whole-genome sequences, high-throughput data such as ChIP-Seq and RNA-Seq. Extracting biologically meaningful information from this massive dataset has become a critical aspect of many recent studies, particularly in annotating and predicting the functions of unknown genes. The core idea behind genome annotation is to identify genes and various functional elements within the genome sequence and to infer their biological functions. Traditional wet-lab experimental methods still rely on extensive efforts for verification. However, early bioinformatics algorithms and software primarily employed shallow learning techniques; thus, their ability to characterize features was limited. With the widespread adoption of RNA-Seq technology, scientists in the community began to harness the potential of machine learning and deep learning approaches for gene structure prediction and annotation. In this context, we reviewed both conventional and contemporary deep learning frameworks, and highlighted perspectives on the challenges arising during annotation, underscoring the dynamic nature of this evolving scientific landscape.
NAR Genomics and Bioinformatics,
Journal Year: 2024, Volume and Issue: 6(3)
Published: July 2, 2024
Long terminal repeat (LTR) retrotransposons constitute a predominant class of repetitive DNA elements in most plant genomes. With the increasing number of sequenced plant genomes, there is an ongoing demand for computational tools facilitating efficient annotation and classification of LTR retrotransposons in plant genome assemblies. Herein, we introduce DANTE, a computational pipeline for Domain-based ANnotation of Transposable Elements, designed for sensitive detection of these elements via their conserved protein domain sequences. The identified protein domains are subsequently inputted into the DANTE_LTR pipeline to annotate complete element sequences by detecting their structural features, such as LTRs, in the adjacent genomic regions. Leveraging domain sequences allows for precise classification of elements into phylogenetic lineages, offering a more granular annotation compared with coarser conventional superfamily-based classification methods. The efficiency and accuracy of this approach were evidenced via annotation of LTR retrotransposons in 93 plant genomes. Results were benchmarked against several established pipelines, showing that DANTE_LTR is capable of identifying significantly more intact LTR retrotransposons. DANTE and DANTE_LTR are provided as user-friendly Galaxy tools accessible via a public server (https://repeatexplorer-elixir.cerit-sc.cz), installable on local Galaxy instances from the Galaxy tool shed or executable from the command line.
BMC Biology,
Journal Year: 2024, Volume and Issue: 22(1)
Published: Aug. 7, 2024
White clover (Trifolium repens) is a globally important perennial forage legume. This species also serves as an eco-evolutionary model system for studying within-species chemical defense variation; it features a well-studied polymorphism for cyanogenesis (HCN release following tissue damage), with higher frequencies of cyanogenic plants favored in warmer locations worldwide. Using a newly generated haplotype-resolved genome and two other long-read assemblies, we tested the hypothesis that copy number variants (CNVs) at cyanogenesis genes play a role in the ability of white clover to rapidly adapt to local environments. We also examined questions on subgenome evolution in this recently evolved allotetraploid species and on chromosomal rearrangements in the broader IRLC legume clade.
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year: 2023, Volume and Issue: unknown
Published: Oct. 16, 2023
Abstract
Motivation: Transposable elements (TEs) are interspersed repetitive sequences that are major constituents of most eukaryotic genomes and are crucial for genome evolution. Despite the existence of multiple tools for their classification and annotation, none of them can achieve completely reliable results, making it a challenge for genomic studies. In this work, we introduce TEclass2, a new software that uses a deep learning approach based upon a linear Transformer architecture with a k-mer tokenizer and further adaptations to handle DNA sequences. This software has an easy configuration that allows training models on new datasets and the classification of TE models, providing metrics for the evaluation of the results.
Results: This work shows a successful adaptation of Transformers for the classification of TE models from consensus sequences, and these results lay a foundation for novel methodologies in bioinformatics. We provide a tool that allows training models on custom data and a web page interface with a pre-trained dataset based on curated and non-curated TE libraries, allowing fast and simple classification of TEs.
Availability: https://bioinformatics.uni-muenster.de/tools/teclass2/index.pl
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
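As an illustration of the k-mer tokenization step mentioned in the TEclass2 abstract, here is a minimal Python sketch. It is not TEclass2's implementation: the function names `kmer_tokens` and `build_vocab`, the default `k=3`, and the `stride` parameter are assumptions made for this example only.

```python
def kmer_tokens(seq: str, k: int = 3, stride: int = 1) -> list[str]:
    """Split a DNA sequence into k-mers, taking one every `stride` bases."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(tokens: list[str]) -> dict[str, int]:
    """Assign an integer id to each distinct token, in order of first appearance."""
    vocab: dict[str, int] = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


# Hypothetical usage: overlapping 3-mers mapped to integer ids,
# the kind of input a Transformer embedding layer would consume.
tokens = kmer_tokens("ACGTAC", k=3)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
```

With `stride=1` the k-mers overlap; setting `stride=k` gives non-overlapping tokens and a shorter input sequence, which is one common trade-off when feeding long DNA sequences to a model.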
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year: 2024, Volume and Issue: unknown
Published: Jan. 31, 2024
Abstract
Transposable Elements (TEs) are abundant repeat sequences found in living organisms. They play a pivotal role in biological evolution and gene regulation and are intimately linked to human diseases. Existing TE classification tools can classify classes, orders, and superfamilies concurrently, but they often struggle to effectively extract sequence features. This limitation frequently results in subpar classification results, especially in hierarchical classification. To tackle this problem, we introduced BERTE, a tool for TE hierarchical classification. BERTE encoded TE sequences into distinctive features that consisted of both attentional and cumulative k-mer frequency information. By leveraging the multi-head self-attention mechanism of the pre-trained BERT model, BERTE transformed sequences into attentional features. Additionally, we calculated multiple cumulative k-mer frequency vectors and concatenated them to form cumulative features. Following feature extraction, a parallel Convolutional Neural Network (CNN) model was employed as an efficient sequence classifier, capitalizing on its capability for high-dimensional feature transformation. We evaluated BERTE’s performance on filtered datasets collected from 12 eukaryotic databases. Experimental results demonstrated that BERTE could improve the F1-score at different levels by up to 21% compared to current state-of-the-art methods. Furthermore, the results indicated that not only could BERT better characterize TE sequences in feature extraction, but also that CNN was more efficient than other popular deep learning classifiers. In general, BERTE classifies TE sequences with greater precision. BERTE is available at https://github.com/yiqichen-2000/BERTE.
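The BERTE abstract mentions k-mer frequency vectors that are concatenated into a single feature. The sketch below shows one plain way to build such a vector in Python. It is an assumption-laden illustration, not BERTE's code: the names `kmer_frequencies` and `concatenated_kmer_features`, the choice of k = 1, 2, 3, and the use of simple relative frequencies (rather than whatever "cumulative" scheme the authors define) are all hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq: str, k: int) -> list[float]:
    """Relative frequency of every possible A/C/G/T k-mer, in lexicographic order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # The total counts every window, so k-mers containing other symbols
    # (for example N) lower the sum of the returned frequencies below 1.
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[kmer] / total if total else 0.0 for kmer in all_kmers]


def concatenated_kmer_features(seq: str, ks: tuple[int, ...] = (1, 2, 3)) -> list[float]:
    """Concatenate the frequency vectors for each k into one feature vector."""
    features: list[float] = []
    for k in ks:
        features.extend(kmer_frequencies(seq, k))
    return features
```

For `ks=(1, 2, 3)` the resulting vector has 4 + 16 + 64 = 84 entries regardless of sequence length, which is what makes this kind of feature convenient as fixed-size input to a classifier.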
Applications in Plant Sciences,
Journal Year: 2023, Volume and Issue: 11(4)
Published: May 11, 2023
Transposable elements (TEs) make up more than half of the genomes of complex plant species and can modulate the expression of neighboring genes, producing significant variability of agronomically relevant traits. The availability of long-read sequencing technologies allows the building of genome assemblies for plant species with large and complex genomes. Unfortunately, TE annotation currently represents a bottleneck in the annotation of genome assemblies.
Mobile DNA,
Journal Year: 2024, Volume and Issue: 15(1)
Published: April 16, 2024
Plant genomes include large numbers of transposable elements. One particular type of these elements is flanked by two Long Terminal Repeats (LTRs) and can translocate using RNA. Such elements are known as LTR-retrotransposons; they are the most abundant type of transposons in plant genomes. They have many important functions involving gene regulation and the rise of new genes and pseudo genes in response to severe stress. Additionally, LTR-retrotransposons have several applications in biotechnology. Due to the abundance and the importance of LTR-retrotransposons, multiple computational tools have been developed for their detection. However, none of these tools take advantage of the availability of related genomes; they process one chromosome at a time. Further, recently nested LTR-retrotransposons (multiple elements of the same family inserted into each other) cannot be annotated accurately, or cannot be annotated at all, by the currently available tools. Motivated to overcome these two limitations, we built Look4LTRs, which can annotate LTR-retrotransposons in multiple related genomes simultaneously and discover recently nested elements. The methodology of Look4LTRs depends on techniques imported from the signal-processing field, graph algorithms, and machine learning with a minimal use of alignment algorithms. Four plant genomes were used in developing Look4LTRs and eight plant genomes in evaluating it in contrast to three related tools. Look4LTRs is the fastest while maintaining better or comparable F1 scores (the harmonic average of recall and precision) to those obtained by the other tools. Our results demonstrate the added benefit of annotating LTR-retrotransposons in multiple related genomes simultaneously and the ability to discover recently nested elements. Expert human manual examination of six elements not included in the ground truth revealed that [...] belong to known families and [...] likely belong to new families. With respect to examining recently nested LTR-retrotransposons, [...] out of five were confirmed to be valid. Look4LTRs, with its speed, accuracy, and novel features, represents a true advancement in the annotation of LTR-retrotransposons, opening the door to many studies focused on understanding their functions in plants.
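The F1 score referred to in the Look4LTRs abstract, the harmonic average of recall and precision, can be computed from raw counts as sketched below. The function name `f1_score` and the count-based inputs are assumptions for illustration; how the authors match predicted elements to the ground truth is not described in the abstract and is not modeled here.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_pos + false_pos  # everything the tool reported
    actual = true_pos + false_neg     # everything in the ground truth
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller of the two values, a tool cannot reach a high F1 score by maximizing recall alone while reporting many false elements, or the reverse.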