bioRxiv (Cold Spring Harbor Laboratory),
Journal year: 2024, Issue: unknown
Published: Aug. 9, 2024
Abstract
Short tandem repeats (STRs) are widespread, dynamic repetitive elements with a number of biological functions and relevance to human diseases. However, their prevalence across taxa remains poorly characterized. Here we examined the impact of STRs in the genomes of 117,253 organisms spanning the tree of life. We find that there are large differences in STR frequencies between organismal genomes, and these differences are largely driven by the taxonomic group an organism belongs to. Using simulated genomes, we find that on average there is no enrichment of STRs in bacterial and archaeal genomes, suggesting that these genomes are not particularly repetitive. In contrast, eukaryotic genomes are orders of magnitude more repetitive than expected. STRs are preferentially located at functional loci in specific taxa. Finally, we utilize the recently completed Telomere-to-Telomere genomes of human and other great apes, and find that STRs are highly abundant and variable between primate species, particularly in peri/centromeric regions. We conclude that STRs have expanded in eukaryotic and viral lineages and not in archaea or bacteria, resulting in large discrepancies in genomic composition.
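
The comparison against simulated genomes described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the simulated genomes were built or how STRs were called, so everything here is an assumption made for illustration: the regular expression (a 1-6 bp motif occurring at least four times in a row), the shuffle-based null model that only preserves base composition, and the function names `str_bases`, `shuffled`, and `str_enrichment`.

```python
import random
import re

# Assumed STR definition: a 1-6 bp motif occurring at least 4 times consecutively.
STR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{3,}")


def str_bases(seq):
    """Total number of bases covered by non-overlapping STR tracts in seq."""
    return sum(m.end() - m.start() for m in STR_PATTERN.finditer(seq))


def shuffled(seq, rng):
    """Return a random permutation of seq (same base composition, no structure)."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def str_enrichment(seq, n_sim=20, seed=0):
    """Return (observed, expected) STR-covered bases.

    'expected' is the mean over n_sim shuffled copies of seq, which stands in
    for a simulated genome under the assumptions stated above.
    """
    rng = random.Random(seed)
    observed = str_bases(seq)
    simulated = [str_bases(shuffled(seq, rng)) for _ in range(n_sim)]
    expected = sum(simulated) / n_sim
    return observed, expected
```

Calling `str_enrichment` on a sequence and comparing the two returned values gives a rough sense of whether the sequence carries more STR content than its base composition alone would produce. A real analysis would need a more careful null model and STR caller.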
bioRxiv (Cold Spring Harbor Laboratory),
Journal year: 2023, Issue: unknown
Published: May 8, 2023
Tools for genotyping tandem repeats (TRs) from short read sequencing data have improved significantly over the past decade. Extensive comparisons of these tools to gold standard diagnostic methods like RP-PCR have confirmed their accuracy for tens to hundreds of well-studied loci. However, a scarcity of high-quality orthogonal truth data limited our ability to measure tool accuracy for the millions of other loci throughout the genome. To address this, we developed a TR truth set based on the Synthetic Diploid Benchmark (SynDip). By identifying the subset of insertions and deletions that represent TR expansions or contractions with motifs between 2 and 50 base pairs, we obtained accurate genotypes for 139,795 pure and 6,845 interrupted repeats in a single diploid sample. Our approach did not require running existing genotyping tools on short read or long read sequencing data and provided an alternative, more accurate view of tandem repeat variation. We applied this truth set to compare the strengths and weaknesses of widely-used tools for genotyping TRs, evaluated the completeness of genome-wide TR catalogs, and explored the properties of tandem repeat variation throughout the genome. We found that, without filtering, ExpansionHunter had higher accuracy than GangSTR and HipSTR over a wide range of motifs and allele sizes. Also, when errors in allele size occurred, ExpansionHunter tended to overestimate expansion sizes, while GangSTR and HipSTR tended to underestimate them. Additionally, we saw that widely-used catalogs miss between 16% and 41% of variant loci in the truth set. These results suggest that genome-wide analyses would benefit from larger catalogs as well as further tool development that builds on current algorithms. To that end, we developed a new catalog of 2.8 million loci that captures 95% of the truth set, and created a modified version of ExpansionHunter that runs 3x faster than the original while producing the same output.
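
The idea of selecting insertions and deletions that represent pure TR expansions or contractions with 2-50 bp motifs can be sketched as follows. This is not the authors' filtering procedure, which the abstract does not specify. It assumes a left-aligned indel whose repeat continues into the reference sequence to its right, and it assumes that an indel counts as a TR change if it contains at least two motif copies or is followed by at least one copy. The names `smallest_motif` and `is_pure_tr_indel` are hypothetical.

```python
def smallest_motif(seq):
    """Shortest motif whose exact repetition yields seq (seq itself if none is shorter)."""
    n = len(seq)
    for size in range(1, n + 1):
        if n % size == 0 and seq == seq[:size] * (n // size):
            return seq[:size]
    return seq


def is_pure_tr_indel(indel_seq, right_flank, min_motif=2, max_motif=50):
    """Heuristic check that an inserted or deleted sequence is a pure repeat change.

    indel_seq   -- the inserted or deleted bases
    right_flank -- reference bases immediately after the indel (assumes left-alignment)
    """
    if not indel_seq:
        return False
    motif = smallest_motif(indel_seq)
    if not (min_motif <= len(motif) <= max_motif):
        return False
    copies_in_indel = len(indel_seq) // len(motif)
    # Assumed rule: two or more copies in the indel, or one copy adjacent in the reference.
    return copies_in_indel >= 2 or right_flank.startswith(motif)
```

For example, `is_pure_tr_indel("CAG", "CAGCAGTT")` returns `True` because the deleted or inserted `CAG` is followed by further `CAG` copies, while the same indel next to `TTTT` is rejected. Interrupted repeats, which the abstract counts separately, would need a more tolerant comparison than this exact-match check.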
medRxiv (Cold Spring Harbor Laboratory),
Journal year: 2024, Issue: unknown
Published: May 21, 2024
Abstract
Approximately 3% of the human genome consists of repetitive elements called tandem repeats (TRs), which include short tandem repeats (STRs) of 1–6bp motifs and variable number tandem repeats (VNTRs) of 7+bp motifs. TR variants contribute to several dozen mono- and polygenic diseases but remain understudied and “enigmatic,” particularly relative to single nucleotide variants. It remains comparatively challenging to interpret the clinical significance of TR variants. Although existing resources provide portions of the necessary data for interpretation at disease-associated loci, it is currently difficult or impossible to efficiently invoke the additional details critical to proper interpretation, such as motif pathogenicity, disease penetrance, and age of onset distributions. It is also often unclear how to apply population information to analyses. We present STRchive (S-T-archive, http://strchive.org/), a dynamic resource consolidating information on TR disease loci in humans from research literature, up-to-date clinical resources, and large-scale genomic databases, with the goal of streamlining TR variant interpretation at disease-associated loci. We apply STRchive (including pathogenic thresholds, motif classification, and disease phenotypes) to a gnomAD cohort of ~18.5k individuals genotyped at 60 disease-associated loci. Through detailed literature curation, we demonstrate that the majority of TR diseases affect children despite being thought of as adult diseases. Additionally, we show that pathogenic genotypes can be found within gnomAD which do not necessarily overlap with known disease prevalence, and leverage STRchive to interpret locus-specific findings therein. We apply a diagnostic blueprint empowered by STRchive to relevant clinical vignettes, highlighting possible pitfalls in TR variant interpretation. As a living resource, STRchive is maintained by experts, takes community contributions, and will evolve as understanding of TR diseases progresses.
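
The STR/VNTR distinction stated at the start of this abstract (1-6 bp motifs versus motifs of 7 bp or more) reduces to a length check, shown here as a small sketch. The function name `tandem_repeat_class` is hypothetical and is not part of STRchive.

```python
def tandem_repeat_class(motif):
    """Classify a tandem repeat by motif length: 1-6 bp is an STR, 7+ bp is a VNTR."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    return "STR" if len(motif) <= 6 else "VNTR"
```

Under this rule a `CAG` motif is classified as an STR, and any motif of seven or more bases is classified as a VNTR.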
ACS Chemical Neuroscience,
Journal year: 2024, Issue: 15(4), pp. 868-876
Published: Feb. 6, 2024
The CAG and CTG trinucleotide repeat expansions cause more than 10 human neurodegenerative diseases. Intrastrand hairpins formed by trinucleotide repeats contribute to repeat expansions, establishing them as potential drug targets. High-resolution structural determination of the hairpins poses a long-standing goal to aid drug development, yet it has not been realized due to the intrinsic conformational flexibility of repetitive sequences. We herein investigate the solution structures of CTG hairpins using nuclear magnetic resonance (NMR) spectroscopy and found that four CTG repeats with a clamping G-C base pair was able to form a stable hairpin structure. We determine the first NMR structure of dG(CTG)