Nature, Journal year: 2024, Issue 630(8016), pp. 401-411
Published: May 29, 2024
Abstract
Apes possess two sex chromosomes: the male-specific Y chromosome and the X chromosome, which is present in both males and females. The Y chromosome is crucial for male reproduction, with deletions being linked to infertility [1]. The X chromosome is vital for reproduction and cognition [2]. Variation in mating patterns and brain function among apes suggests corresponding differences in their sex chromosomes. However, owing to their repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the methodology developed for the telomere-to-telomere (T2T) human genome, we produced gapless assemblies of the X and Y chromosomes for five great apes (bonobo (Pan paniscus), chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla gorilla), Bornean orangutan (Pongo pygmaeus) and Sumatran orangutan (Pongo abelii)) and a lesser ape (the siamang gibbon (Symphalangus syndactylus)), and untangled the intricacies of their evolution. Compared with the X chromosomes, the ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements, owing to the accumulation of lineage-specific ampliconic regions, palindromes, transposable elements and satellites. Many Y chromosome genes expand in multi-copy families and some evolve under purifying selection. Thus, the Y chromosome exhibits dynamic evolution, whereas the X chromosome is more stable. Mapping short-read sequencing data to these assemblies revealed diversity and selection patterns on sex chromosomes of more than 100 individual great apes. These reference assemblies are expected to inform human evolution and conservation genetics of non-human apes, all of which are endangered species.
Science, Journal year: 2022, Issue 376(6588), pp. 44-53
Published: March 31, 2022
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
Nature, Journal year: 2023, Issue 617(7960), pp. 312-324
Published: May 10, 2023
Abstract
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals [1]. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.
A closer look at centromeres
Centromeres are key for anchoring chromosomes to the mitotic spindle, but they have been difficult to sequence because they can contain many repeating DNA elements. These repeats, however, carry regularly spaced, distinctive sequence markers because of sequence heterogeneity between the mostly, but not completely, identical DNA sequence repeats. Such differences aid sequence assembly. Naish et al. used ultra-long-read DNA sequencing to establish a reference assembly that resolves all five centromeres in the small mustard plant Arabidopsis. Their view into the subtly homogenized world of centromeres reveals retrotransposons that interrupt centromere organization and repressive DNA methylation that excludes centromeres from meiotic crossover repair. Thus, Arabidopsis centromeres evolve under the opposing forces of sequence homogenization and retrotransposon disruption. (PJH)
Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary and biomedical discovery. Simultaneously, this reference eliminates tens of thousands of spurious variants per sample, including reduction of false positives in 269 medically relevant genes by up to a factor of 12. Because of these improvements in variant discovery coupled with population and functional genomic resources, T2T-CHM13 is positioned to replace GRCh38 as the prevailing reference for human genetics.
Mobile elements and repetitive genomic regions are sources of lineage-specific genomic innovation and uniquely fingerprint individual genomes. Comprehensive analyses of such repeat elements, including those found in more complex regions of the genome, require a complete, linear genome assembly. We present a de novo repeat discovery and annotation of the T2T-CHM13 human reference genome. We identified previously unknown satellite arrays, expanded the catalog of variants and families for repeats and mobile elements, characterized classes of complex composite repeats, and located retroelement transduction events. We detected nascent transcription and delineated CpG methylation profiles to define the structure of transcriptionally active retroelements in humans, including those in centromeres. These data expand our insight into the diversity, distribution, and evolution of repetitive regions that have shaped the human genome.
The completion of a telomere-to-telomere human reference genome, T2T-CHM13, has resolved complex regions of the genome, including repetitive and homologous regions. Here, we present a high-resolution epigenetic study of previously unresolved sequences, representing entire acrocentric chromosome short arms, gene family expansions, and a diverse collection of repeat classes. This resource precisely maps CpG methylation (32.28 million CpGs), DNA accessibility, and short-read datasets (166,058 previously unresolved chromatin immunoprecipitation sequencing peaks) to provide evidence of activity across previously unidentified or corrected genes and reveals clinically relevant paralog-specific regulation. Probing CpG methylation across human centromeres from six diverse individuals generated an estimate of variability in kinetochore localization. This analysis provides a framework with which to investigate the most elusive regions of the human genome, granting insights into epigenetic regulation.
Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.
bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2023, Issue: unknown
Published: March 12, 2023
Abstract
Long read sequencing data, particularly those derived from the Oxford Nanopore (ONT) platform, tend to exhibit a high error rate. Here, we present NextDenovo, a highly efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. NextDenovo can rapidly correct reads; these corrected reads contain fewer errors than other comparable tools and are characterized by fewer chimeric alignments. We applied NextDenovo to assemble high quality reference genomes of 35 diverse humans from across the world using ONT Nanopore long read sequencing data. Based on these de novo genome assemblies, we were able to identify the landscape of segmental duplications and gene copy number variation in the modern human population. The use of the NextDenovo program should pave the way for population-scale long-read assembly, thereby facilitating the construction of human pan-genomes,