Nature Genetics,
Journal Year:
2024,
Volume and Issue:
56(6), P. 1134 - 1146
Published: May 28, 2024
The functional impact and cellular context of mosaic structural variants (mSVs) in normal tissues is understudied. Utilizing Strand-seq, we sequenced 1,133 single-cell genomes from 19 human donors of increasing age, and discovered the heterogeneous mSV landscapes of hematopoietic stem and progenitor cells. While mSVs are continuously acquired throughout life, expanded subclones in our cohort are confined to individuals >60. Cells already harboring mSVs are more likely to acquire additional somatic structural variants, including megabase-scale segmental aneuploidies. Capitalizing on comprehensive single-cell micrococcal nuclease digestion with sequencing reference data, we conducted high-resolution cell-typing for eight hematopoietic stem and progenitor cells. Clonally expanded mSVs disrupt normal cellular function by dysregulating diverse cellular pathways, and enriching for myeloid progenitors. Our findings underscore the contribution of mSVs to the cellular and molecular phenotypes associated with the aging hematopoietic system, and establish a foundation for deciphering the molecular links between mSVs, aging and disease susceptibility in normal tissues.
Nature Ecology & Evolution,
Journal Year:
2022,
Volume and Issue:
6(12), P. 1965 - 1979
Published: Oct. 17, 2022
Abstract
Chromosomal inversions are an important form of structural variation that can affect recombination, chromosome structure and fitness. However, because inversions can be challenging to detect, the prevalence and hence the significance of inversions segregating within species remains largely unknown, especially in natural populations of mammals. Here, by combining population-genomic and long-read sequencing analyses in a single, widespread species of deer mouse (Peromyscus maniculatus), we identified 21 polymorphic inversions that are large (1.5–43.8 Mb) and cause near-complete suppression of recombination when heterozygous (0–0.03 cM Mb⁻¹). We found that inversion breakpoints frequently occur in centromeric and telomeric regions and are often flanked by long inverted repeats (0.5–50 kb), suggesting that they probably arose via ectopic recombination. By genotyping inversions in populations across the species’ range, we found that the inversions are often widespread and do not harbour deleterious mutational loads, and many are likely to be maintained as polymorphisms by divergent selection. Comparisons of forest and prairie ecotypes of deer mice revealed 13 inversions that contribute to differentiation between populations, of which five exhibit significant associations with traits implicated in local adaptation. Taken together, these results show that inversions in deer mice have a significant impact on recombination, genome structure and genetic diversity, and likely facilitate local adaptation across the widespread range of this species.
Cell,
Journal Year:
2024,
Volume and Issue:
187(6), P. 1547 - 1562.e13
Published: Feb. 29, 2024
We sequenced and assembled using multiple long-read sequencing technologies the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ∼27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.
Nature Genetics,
Journal Year:
2025,
Volume and Issue:
unknown
Published: Jan. 8, 2025
Abstract
Segmental duplications (SDs) contribute significantly to human disease, evolution and diversity but have been difficult to resolve at the sequence level. We present a population genetics survey of SDs by analyzing 170 human genome assemblies (from 85 samples representing 38 Africans and 47 non-Africans) in which the majority of autosomal SDs are fully resolved using long-read sequence assembly. Excluding the acrocentric short arms and sex chromosomes, we identify 173.2 Mb of duplicated sequence (47.4 Mb not present in the telomere-to-telomere reference) distinguishing fixed from structurally polymorphic events. We find that intrachromosomal SDs are among the most variable, with rare events mapping near their progenitor sequences. African genomes harbor significantly more intrachromosomal SDs and are more likely to have recently duplicated gene families with higher copy numbers than non-African samples. Comparison to a resource of 563 million full-length isoform sequencing reads identifies 201 novel, potentially protein-coding genes corresponding to these copy number polymorphic SDs.
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2022,
Volume and Issue:
unknown
Published: June 26, 2022
Abstract
The Telomere-to-Telomere consortium recently assembled the first truly complete sequence of a human genome. To resolve the most complex repeats, this project relied on manual integration of ultra-long Oxford Nanopore sequencing reads with a high-resolution assembly graph built from long, accurate PacBio HiFi reads. We have improved and automated this strategy in Verkko, an iterative, graph-based pipeline for assembling complete, diploid genomes. Verkko begins with a multiplex de Bruijn graph built from long, accurate reads and progressively simplifies this graph via the integration of ultra-long reads and haplotype-specific markers. The result is a phased, diploid assembly of both haplotypes, with many chromosomes automatically assembled from telomere to telomere. Running Verkko on the HG002 human genome resulted in 20 of 46 diploid chromosomes assembled without gaps at 99.9997% accuracy. The complete assembly of diploid genomes is a critical step towards the construction of comprehensive pangenome databases and chromosome-scale comparative genomics.
Genome Research,
Journal Year:
2023,
Volume and Issue:
33(4), P. 496 - 510
Published: April 1, 2023
There has been tremendous progress in phased genome assembly production by combining long-read data with parental information or linked-read data. Nevertheless, a typical phased genome assembly generated by trio-hifiasm still generates more than 140 gaps. We perform a detailed analysis of gaps, assembly breaks, and misorientations from 182 haploid assemblies obtained from a diversity panel of 77 unique human samples. Although trio-based approaches using HiFi are the current gold standard, chromosome-wide phasing accuracy is comparable when using Strand-seq instead of parental data. Importantly, the majority of assembly gaps cluster near the largest and most identical repeats (including segmental duplications [35.4%], satellite DNA [22.3%], or regions enriched in GA/AT-rich DNA [27.4%]). Consequently, 1513 protein-coding genes overlap assembly gaps in at least one haplotype, and 231 are recurrently disrupted or missing from five or more haplotypes. Furthermore, we estimate that 6–7 Mbp of DNA are misorientated per haplotype irrespective of whether trio-free or trio-based approaches are used. Of these misorientations, 81% correspond to bona fide large inversion polymorphisms in the human species, most of which are flanked by large segmental duplications. We also identify large-scale alignment discontinuities consistent with 11.9 Mbp of deletions and 161.4 Mbp of insertions per haploid genome. Although 99% of this variation corresponds to satellite DNA, we identify 230 regions of euchromatic DNA with frequent expansions and contractions, nearly half of which overlap with 197 protein-coding genes. Such variable and incompletely assembled regions are important targets for future algorithmic development and pangenome representation.