Genome Research
Journal Year: 2021
Volume and Issue: 32(1), pp. 1-27
Published: Dec. 29, 2021
Expansions of gene-specific DNA tandem repeats (TRs), first described in 1991 as a disease-causing mutation in humans, are now known to cause >60 phenotypes, not just disease, and not only in humans. TRs are a common form of genetic variation with biological consequences, observed, so far, in humans, dogs, plants, oysters, and yeast. Repeat diseases show atypical clinical features, genetic anticipation, and multiple and partially penetrant phenotypes among family members. Discovery of disease-causing repeat expansion loci accelerated through technological advances in DNA sequencing and computational analyses. Between 2019 and 2021, 17 new disease-causing TR expansions were reported, totaling 63 TR loci (>69 diseases), with a likelihood of more discoveries, and in more organisms. Recent and historical lessons reveal that properly assessed clinical presentations, coupled with genetic and biological awareness, can guide discovery of disease-causing unstable TRs. We highlight critical but underrecognized aspects of TR mutations. Repeat motifs may not be present in current reference genomes but will be in forthcoming gapless long-read references. Repeat motif size can be a single nucleotide to kilobases/unit. At a given locus, repeat motif sequence purity can vary with consequence. Pathogenic repeats can be “insertions” within nonpathogenic TRs. Expansions, contractions, and somatic length variations of TRs can have clinical/biological consequences. TR instabilities occur in humans and other organisms. TR expansions may be epigenetically modified and/or chromosomal fragile sites. We discuss the expanding field of disease-associated TR instabilities, highlighting prospects, clues, tools, and challenges for further discoveries of disease-causing TR instabilities and understanding their biological and pathological impacts, a vista that is about to expand.
Science
Journal Year: 2022
Volume and Issue: 376(6588), pp. 44-53
Published: March 31, 2022
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion–base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
Science
Journal Year: 2022
Volume and Issue: 376(6588)
Published: March 31, 2022
Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.
Science
Journal Year: 2022
Volume and Issue: 376(6588)
Published: March 31, 2022
Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary and biomedical discovery. Simultaneously, this reference eliminates tens of thousands of spurious variants per sample, including reduction of false positives in 269 medically relevant genes by up to a factor of 12. Because of these improvements in variant discovery coupled with population and functional genomic resources, T2T-CHM13 is positioned to replace GRCh38 as the prevailing reference for human genetics.
Science
Journal Year: 2022
Volume and Issue: 376(6588)
Published: March 31, 2022
Mobile elements and repetitive genomic regions are sources of lineage-specific genomic innovation and uniquely fingerprint individual genomes. Comprehensive analyses of such repeat elements, including those found in more complex regions of the genome, require a complete, linear genome assembly. We present a de novo repeat discovery and annotation of the T2T-CHM13 human reference genome. We identified previously unknown satellite arrays, expanded the catalog of variants and families for repeats and mobile elements, characterized classes of complex composite repeats, and located retroelement transduction events. We detected nascent transcription and delineated CpG methylation profiles to define the structure of transcriptionally active retroelements in humans, including those in centromeres. These data expand our insight into the diversity, distribution, and evolution of repetitive regions that have shaped the human genome.
Science
Journal Year: 2022
Volume and Issue: 376(6588)
Published: March 31, 2022
The completion of a telomere-to-telomere human reference genome, T2T-CHM13, has resolved complex regions of the genome, including repetitive and homologous regions. Here, we present a high-resolution epigenetic study of previously unresolved sequences, representing entire acrocentric chromosome short arms, gene family expansions, and a diverse collection of repeat classes. This resource precisely maps CpG methylation (32.28 million CpGs), DNA accessibility, and short-read datasets (166,058 previously unresolved chromatin immunoprecipitation sequencing peaks) to provide evidence of activity across previously unidentified or corrected genes and reveals clinically relevant paralog-specific regulation. Probing CpG methylation across human centromeres from six diverse individuals generated an estimate of variability in kinetochore localization. This analysis provides a framework with which to investigate the most elusive regions of the human genome, granting insights into epigenetic regulation.
Nature
Journal Year: 2023
Volume and Issue: 617(7960), pp. 335-343
Published: May 10, 2023
The short arms of the human acrocentric chromosomes 13, 14, 15, 21 and 22 (SAACs) share large homologous regions, including ribosomal DNA repeats and extended segmental duplications [1,2]. Although the resolution of these regions in the first complete assembly of a human genome, the Telomere-to-Telomere Consortium’s CHM13 assembly (T2T-CHM13), provided a model of their homology [3], it remained unclear whether these patterns were ancestral or maintained by ongoing recombination exchange. Here we show that acrocentric chromosomes contain pseudo-homologous regions (PHRs) indicative of recombination between non-homologous sequences. Utilizing an all-to-all comparison of the human pangenome from the Human Pangenome Reference Consortium [4] (HPRC), we find that contigs from all of the SAACs form a community. A variation graph [5] constructed from centromere-spanning acrocentric contigs indicates the presence of regions in which most contigs appear nearly identical between heterologous acrocentric chromosomes in T2T-CHM13. Except on chromosome 15, we observe faster decay of linkage disequilibrium in the pseudo-homologous regions than in the corresponding short and long arms, indicating higher rates of recombination [6,7]. The pseudo-homologous regions include sequences that have previously been shown to lie at the breakpoint of Robertsonian translocations [8], and their arrangement is compatible with crossover in inverted duplications on chromosomes 13, 14 and 21. The ubiquity of signals of recombination between heterologous acrocentric chromosomes seen in the HPRC draft pangenome suggests that these shared sequences form the basis for recurrent Robertsonian translocations, providing sequence and population-based confirmation of hypotheses first developed from cytogenetic studies 50 years ago [9].
Annual Review of Genomics and Human Genetics
Journal Year: 2023
Volume and Issue: 24(1), pp. 109-132
Published: April 19, 2023
DNA sequencing has revolutionized medicine over recent decades. However, analysis of large structural variation and repetitive DNA, a hallmark of human genomes, has been limited by short-read technology, with read lengths of 100-300 bp. Long-read sequencing (LRS) permits routine sequencing of human DNA fragments tens to hundreds of kilobase pairs in size, using both real-time sequencing by synthesis and nanopore-based direct electronic sequencing. LRS permits analysis of large structural variation and haplotypic phasing in human genomes and has enabled the discovery and characterization of rare pathogenic structural variants and repeat expansions. It has also recently enabled the assembly of a complete, gapless human genome that includes previously intractable regions, such as highly repetitive centromeres and homologous acrocentric short arms. With the addition of protocols for targeted enrichment, direct epigenetic DNA modification detection, and long-range chromatin profiling, LRS promises to launch a new era of understanding of genetic diversity and pathogenic mutations in human populations.