Scientific Data, Journal Year: 2024, Volume and Issue: 11(1)
Published: March 8, 2024
Abstract
Wild germplasm resources are crucial for gene mining and molecular breeding because of their special trait performance. A haplotype-resolved genome is an ideal solution for fully understanding the biology of subgenomes in highly heterozygous species. Here, we surveyed a wild walnut tree from Gongliu County, Xinjiang, China, and generated a haplotype-resolved reference genome of 562.99 Mb (contig N50 = 34.10 Mb) for one haplotype (hap1) and 561.07 Mb (contig N50 = 33.91 Mb) for another haplotype (hap2) using PacBio high-fidelity (HiFi) reads and Hi-C technology. Approximately 527.20 Mb (93.64%) of hap1 and 526.40 Mb (93.82%) of hap2 were assigned to 16 pseudochromosomes. A total of 41039 and 39744 protein-coding gene models were predicted for hap1 and hap2, respectively. Moreover, 123 structural variations (SVs) were identified between the two haplotype genomes. Allele-specific expression genes (ASEGs) that respond to cold stress were ultimately identified. These datasets can be used to study subgenome evolution, mine functional elite genes, and discover the transcriptional basis of specific traits related to environmental adaptation in wild walnut.
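The anchoring percentages quoted in this abstract follow directly from the reported lengths, so they can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the function name `anchoring_rate` are assumptions made here, not anything taken from the paper.

```python
# Assembly sizes and pseudochromosome-anchored lengths in Mb, as quoted in the abstract.
assemblies = {
    "hap1": {"total_mb": 562.99, "anchored_mb": 527.20},
    "hap2": {"total_mb": 561.07, "anchored_mb": 526.40},
}


def anchoring_rate(total_mb, anchored_mb):
    """Return the percentage of an assembly placed on pseudochromosomes."""
    return 100.0 * anchored_mb / total_mb


# Round to two decimals, matching the precision used in the abstract.
rates = {name: round(anchoring_rate(**v), 2) for name, v in assemblies.items()}

for name, pct in rates.items():
    print(f"{name}: {pct:.2f}% anchored")
```

Both values agree with the 93.64% and 93.82% given in the text.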
Genome Biology, Journal Year: 2019, Volume and Issue: 20(1)
Published: Dec. 16, 2019
Abstract
Background
Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations.
Results
We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species.
Conclusions
The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA
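The six performance metrics named in the Results can all be derived from one confusion matrix. The sketch below uses the standard textbook definitions; the class name `Confusion`, the function `benchmark_metrics`, and the example counts are hypothetical, and the abstract does not say whether the authors counted base pairs or elements, so the unit of counting is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of true/false positives and negatives (unit of counting is an assumption)."""
    tp: int
    fp: int
    tn: int
    fn: int


def benchmark_metrics(c: Confusion) -> dict:
    """Compute the six metrics from their standard definitions."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),        # share of real TEs recovered
        "specificity": c.tn / (c.tn + c.fp),        # share of non-TE correctly excluded
        "accuracy": (c.tp + c.tn) / total,          # share of all calls that are correct
        "precision": c.tp / (c.tp + c.fp),          # share of TE calls that are real
        "fdr": c.fp / (c.tp + c.fp),                # false discovery rate = 1 - precision
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),  # harmonic mean of precision and sensitivity
    }


# Made-up counts, purely for illustration.
example = benchmark_metrics(Confusion(tp=80, fp=20, tn=890, fn=10))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Reporting FDR alongside precision is redundant in a strict sense, since the two sum to one, but listing both matches the metric set in the abstract.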
Science, Journal Year: 2021, Volume and Issue: 373(6555), P. 655 - 662
Published: Aug. 5, 2021
We report de novo genome assemblies, transcriptomes, annotations, and methylomes for the 26 inbreds that serve as the founders for the maize nested association mapping population. The number of pan-genes in these diverse genomes exceeds 103,000, with approximately a third found across all genotypes. The results demonstrate that the ancient tetraploid character of maize continues to degrade by fractionation to the present day. Excellent contiguity over repeat arrays and complete annotation of centromeres revealed additional variation in major cytological landmarks. We show that combining structural variation with single-nucleotide polymorphisms can improve the power of quantitative mapping studies. We also document variation at the level of DNA methylation and demonstrate that unmethylated regions are enriched for cis-regulatory elements that contribute to phenotypic variation.
F1000Research, Journal Year: 2025, Volume and Issue: unknown
Published: Feb. 19, 2025
The Madagascar periwinkle, Catharanthus roseus, belongs to the Apocynaceae family. This medicinal plant, endemic to Madagascar, produces many important drugs including the monoterpene indole alkaloids (MIA) vincristine and vinblastine used to treat cancer worldwide. Here, we provide a new version of the C. roseus genome sequence obtained through the combination of Oxford Nanopore Technologies long-reads and Illumina short-reads. This more contiguous assembly consists of 173 scaffolds with a total length of 581.128 Mb and an N50 of 12.241 Mb. Using publicly available RNAseq data, 21,061 protein coding genes were predicted and functionally annotated. A total of 42.87% of the genome was annotated as transposable elements, most of them being long-terminal repeats. Together with the increasing access to MIA-producing plant genomes, this updated version should ease evolutionary studies leading to a better understanding of MIA biosynthetic pathway evolution.
Cell Research, Journal Year: 2022, Volume and Issue: 32(10), P. 878 - 896
Published: July 12, 2022
Pan-genomes from large natural populations can capture genetic diversity and reveal genomic complexity. Using de novo long-read assembly, we generated a graph-based super pan-genome of rice consisting of a 251-accession panel comprising both cultivated and wild species of Asian and African rice. Our pan-genome reveals extensive structural variations (SVs) and gene presence/absence variations. Additionally, our pan-genome enables the accurate identification of nucleotide-binding leucine-rich repeat genes and characterization of their inter- and intraspecific diversity. Moreover, we uncovered grain weight-associated SVs which specify traits by affecting the expression of their nearby genes. We characterized genetic variants associated with submergence tolerance, seed shattering and plant architecture and found independent selection for a common set of genes that drove adaptation and domestication in Asian and African rice. This super pan-genome facilitates pinpointing of lineage-specific haplotypes for trait-associated genes and provides insights into the evolutionary events that have shaped the genomic architecture of various rice species.
Nature Genetics, Journal Year: 2024, Volume and Issue: 56(4), P. 710 - 720
Published: March 15, 2024
Abstract
Polyploidy (genome duplication) is a pivotal force in evolution. However, the interactions between parental genomes in a polyploid nucleus, frequently involving subgenome dominance, are poorly understood. Here we showcase analyses of a bamboo system (Poaceae: Bambusoideae) comprising a series of lineages from diploid (herbaceous) to tetraploid and hexaploid (woody), with 11 chromosome-level de novo genome assemblies and 476 transcriptome samples. We find that woody bamboo subgenomes exhibit stunning karyotype stability, with parallel subgenome dominance in the two tetraploid clades and a gradual shift of dominance in the hexaploid clade. Allopolyploidization and subgenome dominance have shaped the evolution of tree-like lignified culms, rapid growth and synchronous flowering characteristic of woody bamboos as large grasses. Our work provides insights into genome dominance in a remarkable polyploid system, including its dependence on genomic context and its ability to switch which subgenomes are dominant over evolutionary time.
Science, Journal Year: 2025, Volume and Issue: 387(6734), P. 637 - 643
Published: Feb. 6, 2025
Some plants have massive sex-linked regions. To test hypotheses about their evolution, we sequenced the genome of Silene latifolia, in which giant heteromorphic sex chromosomes were first discovered in 1923. It has long been known that the Y chromosome consists mainly of a male-specific region that does not recombine with the X chromosome and carries the sex-determining genes and genes with other male functions. However, only with a whole Y chromosome assembly can candidate genes be validated experimentally and their locations determined and related to the suppression of recombination. We describe the genomic changes as the ancestral chromosome evolved into the current XY pair, testing ideas about the evolution of large nonrecombining regions and the mechanisms that created the present recombination pattern.
Genome Biology, Journal Year: 2020, Volume and Issue: 21(1)
Published: July 6, 2020
Abstract
Background
Gene expression is a key determinant of cellular response. Natural variation in gene expression bridges genetic variation to phenotypic alteration. Identification of the regulatory variants controlling the gene expression in response to drought, a major environmental threat to crop production worldwide, is of great value for drought-tolerant gene identification.
Results
A total of 627 RNA-seq analyses are performed for 224 maize accessions which represent a wide genetic diversity under three water regimes; 73,573 eQTLs are detected for about 30,000 expressing genes with high-density genome-wide single nucleotide polymorphisms, reflecting a comprehensive and dynamic genetic architecture of gene expression in response to drought. The regulatory variants controlling the gene expression constitutively or drought-dynamically are unraveled. Focusing on dynamic regulatory variants resolved to genes encoding transcription factors, a drought-responsive network reflecting a hierarchy of transcription factors and their target genes is built. Moreover, 97 genes are prioritized to associate with drought tolerance due to their expression variations through the Mendelian randomization analysis. One of the candidate genes, Abscisic acid 8′-hydroxylase, is verified to play a negative role in plant drought tolerance.
Conclusions
This study unravels the effects of genetic variants on gene expression dynamics in drought response, which allows us to better understand the role of distal and proximal genetic effects on gene expression and phenotypic plasticity. The prioritized drought-associated genes may serve as direct targets for functional investigation or allelic mining.
Bioinformatics, Journal Year: 2020, Volume and Issue: 36(15), P. 4269 - 4275
Published: May 12, 2020
Abstract
Motivation
Transposable elements (TEs) classification is an essential step to decode their roles in genome evolution. With a large number of genomes from non-model species becoming available, accurate and efficient TE classification has emerged as a new challenge in genomic sequence analysis.
Results
We developed a novel tool, DeepTE, which classifies unknown TEs using convolutional neural networks (CNNs). DeepTE transferred sequences into input vectors based on k-mer counts. A tree structured classification process was used where eight models were trained to classify TEs into super families and orders. DeepTE also detected domains inside TEs to correct false classification. An additional model was trained to distinguish between non-TEs and TEs in plants. Given unclassified TEs of different species, DeepTE can classify TEs into seven orders, which include 15, 24 and 16 super families in plants, metazoans and fungi, respectively. In several benchmarking tests, DeepTE outperformed other existing tools for TE classification. In conclusion, DeepTE successfully leverages CNN for TE classification, and can be used to precisely classify TEs in newly sequenced eukaryotic genomes.
Availability and implementation
DeepTE is accessible at https://github.com/LiLabAtVT/DeepTE.
Supplementary information
Supplementary data are available at Bioinformatics online.
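The step of turning a sequence into an input vector of k-mer counts, mentioned in the Results above, can be illustrated generically. This is a minimal sketch and not DeepTE's own code: the function name `kmer_count_vector`, the default `k=3`, the lexicographic ordering of k-mers, and the choice to skip k-mers containing ambiguous bases are all assumptions made for the example.

```python
from itertools import product


def kmer_count_vector(seq, k=3, alphabet="ACGT"):
    """Return a fixed-length list of k-mer counts for a DNA sequence.

    The vector has len(alphabet) ** k entries, ordered lexicographically
    by k-mer. Windows containing characters outside the alphabet
    (for example N) are skipped; that policy is an assumption of this sketch.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    seq = seq.upper()
    # Slide a window of width k across the sequence, one base at a time.
    for start in range(len(seq) - k + 1):
        pos = index.get(seq[start:start + k])
        if pos is not None:
            counts[pos] += 1
    return counts


# Toy sequence: the 2-mers are AC, CG, GT, TA, AC, CG, and GN (skipped).
vec = kmer_count_vector("ACGTACGN", k=2)
print(len(vec), sum(vec))
```

Because the vector length depends only on `k` and the alphabet, sequences of any length map to inputs of the same shape, which is what a fixed-architecture classifier needs.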
The Plant Journal, Journal Year: 2019, Volume and Issue: 100(5), P. 1052 - 1065
Published: Aug. 5, 2019
Transposable elements (TEs) are ubiquitous components of eukaryotic genomes and can create variation in genome organization and content. Most maize genomes are composed of TEs. We developed an approach to define shared and variable TE insertions across genome assemblies and applied this method to four maize genomes (B73, W22, Mo17 and PH207) with uniform structural annotations of TEs. Among these genomes we identified approximately 400 000 TEs that are polymorphic, encompassing 1.6 Gb of variable TE sequence. These polymorphic TEs include a combination of recent transposition events as well as deletions of older TEs. There are examples of polymorphic TEs within each of the superfamilies of TEs and they are found distributed across the genome, including in regions of recent shared ancestry among individuals. There are many examples of polymorphic TEs within or near maize genes. In addition, there are 2380 gene annotations in the B73 genome that are located within variable TEs, providing evidence for a role of TEs in contributing to the substantial differences in annotated gene content among these genotypes. TEs are highly variable in our survey of four temperate maize genomes, highlighting the major contribution of TEs in driving variation in genome organization and gene content.
OPEN RESEARCH BADGES: This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at https://github.com/SNAnderson/maizeTE_variation; https://mcstitzer.github.io/maize_TEs.