bioRxiv (Cold Spring Harbor Laboratory),
Year: 2024, Issue: unknown
Published: Aug. 9, 2024
Abstract
Short tandem repeats (STRs) are widespread, dynamic repetitive elements with a number of biological functions and relevance to human diseases. However, their prevalence across taxa remains poorly characterized. Here we examined the impact of STRs in the genomes of 117,253 organisms spanning the tree of life. We find that there are large differences in the frequencies of STRs between organismal genomes, and these differences are largely driven by the taxonomic group an organism belongs to. Using simulated genomes, we find that on average there is no enrichment of STRs in bacterial and archaeal genomes, suggesting that these genomes are not particularly repetitive. In contrast, we find that eukaryotic genomes are orders of magnitude more repetitive than expected. STRs are preferentially located at functional loci in specific taxa. Finally, we utilize the recently completed Telomere-to-Telomere genomes of human and other great apes, and find that STRs are highly abundant and variable between primate species, particularly in peri/centromeric regions. We conclude that STRs have expanded in eukaryotic and viral lineages and not in archaea or bacteria, resulting in large discrepancies in genomic composition.
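The comparison against simulated genomes described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' pipeline: the STR definition (perfect repeats of 1-6 bp units, at least 4 copies, at least 10 bp), the null model (shuffling the sequence so that base composition is preserved), and the helper names count_str_tracts and observed_vs_shuffled.

```python
import random
import re


def count_str_tracts(seq, max_unit=6, min_copies=4, min_len=10):
    """Count perfect tandem repeats of 1..max_unit bp units (assumed definition)."""
    # ([ACGT]{1,6}?) lazily picks the shortest unit; \1{3,} requires at least
    # three further copies, so a tract has min_copies or more copies in total.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    return sum(1 for m in pattern.finditer(seq.upper()) if len(m.group(0)) >= min_len)


def observed_vs_shuffled(seq, n_sim=20, seed=0):
    """Return (observed count, mean count over shuffled copies of seq)."""
    rng = random.Random(seed)
    observed = count_str_tracts(seq)
    simulated = []
    for _ in range(n_sim):
        shuffled = "".join(rng.sample(seq, len(seq)))  # keeps base composition
        simulated.append(count_str_tracts(shuffled))
    return observed, sum(simulated) / n_sim


if __name__ == "__main__":
    # Toy sequence: five CAG tracts separated by a non-repetitive spacer.
    toy = ("CAG" * 10 + "ACGTTGCATCGA") * 5
    print(observed_vs_shuffled(toy))
```

An observed count well above the shuffled mean would indicate enrichment under this simple null. A real analysis would need a null model that also preserves higher-order sequence composition.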
Nucleic Acids Research,
Year: 2023, Issue: 52(D1), pp. D18-D32
Published: Nov. 29, 2023
Abstract
The National Genomics Data Center (NGDC), which is a part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support the global academic and industrial communities. With the rapid accumulation of multi-omics data at an unprecedented pace, CNCB-NGDC continuously expands and updates core database resources through big data archiving, integrative analysis and value-added curation. Importantly, NGDC collaborates closely with major international databases and initiatives to ensure seamless data exchange and interoperability. Over the past year, significant efforts have been dedicated to integrating diverse omics data, synthesizing expanding knowledge, developing new resources, and upgrading major existing resources. Particularly, several database resources are newly developed for the biodiversity of protists (P10K) and bacteria (NTM-DB, MPA), as well as plant (PPGR, SoyOmics, PlantPan) and disease/trait association (CROST, HervD Atlas, HALL, MACdb, BioKA, RePoS, PGG.SV, NAFLDkb). All the resources and services are publicly accessible at https://ngdc.cncb.ac.cn.
Communications Biology,
Year: 2023, Issue: 6(1)
Published: Sep. 19, 2023
Abstract
Repetitive DNA sequences play critical roles in driving evolution, inducing variation, and regulating gene expression. In this review, we summarized the definition, arrangement, and structural characteristics of repeats. Besides, we introduced the diverse biological functions of repeats and reviewed existing methods for automatic repeat detection, classification, and masking. Finally, we analyzed the type, structure, and regulation of repeats in the human genome and their role in the induction of complex diseases. We believe that this review will facilitate a comprehensive understanding of repeats and provide guidance for repeat annotation and in-depth exploration of its association with human diseases.
Cell,
Year: 2024, Issue: 187(9), pp. 2336-2341.e5
Published: April 1, 2024
The Genome Aggregation Database (gnomAD), widely recognized as the gold-standard reference map of human genetic variation, has largely overlooked tandem repeat (TR) expansions, despite the fact that TRs constitute ∼6% of our genome and are linked to over 50 human diseases. Here, we introduce TR-gnomAD (https://wlcb.oit.uci.edu/TRgnomAD), a biobank-scale reference of 0.86 million TRs derived from 338,963 whole-genome sequencing (WGS) samples of diverse ancestries (39.5% non-European samples). TR-gnomAD offers critical insights into ancestry-specific disease prevalence using disparities in TR unit number frequencies among ancestries. Moreover, TR-gnomAD is able to differentiate between common, presumably benign TR expansions, which are prevalent in TR-gnomAD, and those potentially pathogenic TR expansions, which are found more frequently in disease groups than within TR-gnomAD. Together, TR-gnomAD is an invaluable resource for researchers and physicians to interpret TR expansions in individuals with genetic diseases.
Nature Communications,
Year: 2025, Issue: 16(1)
Published: Jan. 28, 2025
Studies of the genetics of Alzheimer's disease (AD) have largely focused on single nucleotide variants and short insertions/deletions. However, most of the disease heritability has yet to be uncovered, suggesting that there is substantial genetic risk conferred by other forms of genetic variation. There are over one million short tandem repeats (STRs) in the genome, and their link to AD risk has not been assessed. As pathogenic expansions of STRs cause over 30 neurologic diseases, it is important to ascertain whether STRs may also be implicated in AD risk. Here, we genotype 312,731 polymorphic STR tracts genome-wide using PCR-free whole genome sequencing data from 2981 individuals (1489 AD case and 1492 control individuals). We implement an approach to identify STR expansions as STRs with tract lengths that are outliers within the population. We then test for differences in aggregate STR expansion burden in AD cases versus control individuals. AD patients harbor a 1.19-fold increase of STR expansions compared to healthy elderly controls (p = 8.27 × 10⁻³, two-sided Mann-Whitney test). Individuals carrying >30 STR expansions have 3.69-fold higher odds of having AD and have more severe AD neuropathology. AD STR expansions are highly enriched within active promoters in post-mortem hippocampal brain tissues, particularly within SINE-VNTR-Alu (SVA) retrotransposons. Together, these results demonstrate that expanded STRs within active promoter regions of the genome associate with risk of AD.

The authors explore how tandem repeat DNA sequences affect Alzheimer's disease risk. They find that individuals who carry a high burden of expanded repeats have more than three-fold increased odds of Alzheimer's disease.
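The two analysis steps named in this abstract (calling expansions as population outliers, then comparing aggregate burden between groups) can be sketched as follows. The abstract does not give the outlier rule, so the cutoff used here (third quartile plus k times the interquartile range) is an assumption, as are the function names and the toy data. Only the U statistic of the Mann-Whitney test is computed, not the p-value.

```python
from statistics import quantiles


def expansion_calls(lengths, k=3.0):
    """Flag tract lengths above Q3 + k * IQR at one locus (assumed rule)."""
    q1, _, q3 = quantiles(lengths, n=4, method="inclusive")
    cutoff = q3 + k * (q3 - q1)
    return [length > cutoff for length in lengths]


def burden_per_individual(genotypes):
    """genotypes: dict of locus -> tract lengths, one value per individual."""
    n_individuals = len(next(iter(genotypes.values())))
    burden = [0] * n_individuals
    for lengths in genotypes.values():
        for i, is_expanded in enumerate(expansion_calls(lengths)):
            burden[i] += is_expanded
    return burden


def mann_whitney_u(x, y):
    """U statistic for x versus y, counting each tie as 0.5."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)


if __name__ == "__main__":
    # Hypothetical toy genotypes: individuals 0-2 are controls, 3-5 are cases.
    toy = {
        "locusA": [10, 10, 11, 10, 40, 10],
        "locusB": [20, 21, 20, 60, 20, 22],
    }
    burden = burden_per_individual(toy)
    print(burden, mann_whitney_u(burden[3:], burden[:3]))
```

With a zero interquartile range this rule flags any length above the third quartile, so a production analysis would need a more robust threshold.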
European Journal of Human Genetics,
Year: 2024, Issue: 32(5), pp. 584-587
Published: Feb. 2, 2024
Abstract
To date, approximately 50 short tandem repeat (STR) disorders have been identified; yet, clinical laboratories rarely conduct STR analysis on exomes. To assess its diagnostic value, we analyzed STRs in 6099 exomes from 2510 families with mostly suspected neurogenetic disorders. We employed ExpansionHunter and REViewer to detect pathogenic STR expansions, confirming them using orthogonal methods. Genotype-phenotype correlations led to the diagnosis of thirteen individuals in seven previously undiagnosed families, identifying three autosomal dominant disorders: dentatorubral-pallidoluysian atrophy (n = 3), spinocerebellar ataxia type 7 (n = 2), and myotonic dystrophy type 1 (n = 2), resulting in a diagnostic gain of 0.28% (7/2510). Additionally, we found expanded ATXN1 alleles (≥39 repeats) with varying patterns of CAT interruptions in twelve individuals, accounting for 0.19% of the Korean population. Our study underscores the importance of integrating STR analysis into the exome sequencing pipeline, broadening the application of exome sequencing for STR assessments.
Scientific Reports,
Year: 2024, Issue: 14(1)
Published: Feb. 9, 2024
Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression data to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, of which nine are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence for gene regulatory roles of eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. Future extensions of these findings could uncover new STR-based targets in the treatment of cancer.
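The idea of a linear model linking eSTR length to expression, and of using its slope to predict the effect of an eSTR mutation, can be shown with a minimal sketch. It assumes a single-predictor ordinary least-squares fit. The function names and the toy values are hypothetical, and the published models may include covariates not shown here.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predicted_expression_change(slope, delta_units):
    """Expected expression shift if the eSTR gains or loses delta_units units."""
    return slope * delta_units


if __name__ == "__main__":
    # Hypothetical data: eSTR length (repeat units) and expression per tumour.
    str_lengths = [10, 11, 12, 13]
    expression = [2.0, 2.5, 3.0, 3.5]
    slope, intercept = fit_line(str_lengths, expression)
    # Predicted effect of a two-unit contraction of the eSTR.
    print(slope, intercept, predicted_expression_change(slope, -2))
```

Under this model the predicted change depends only on the slope and the size of the length change, which is what makes mutation-response predictions straightforward.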
Nature Communications,
Year: 2024, Issue: 15(1)
Published: Aug. 24, 2024
Tandem repeats (TRs) are genomic regions that tandemly change in repeat number, which are often multiallelic. Their characteristics and contributions to gene expression and quantitative traits in rice are largely unknown. Here, we survey rice TR variations based on 231 genome assemblies and the rice pan-genome graph. We identify 227,391 multiallelic TR loci, including 54,416 TR loci that are absent from the Nipponbare reference genome. Only 1/3 of the TR variations show strong linkage with nearby bi-allelic variants (SNPs, Indels and PAVs). Using 193 panicle and 202 leaf transcriptomic data, we reveal that 485 and 511 TRs, respectively, act as QTLs for nearby gene expression independently of other bi-allelic variants. Using plant height and grain width as examples, we validate the contributions of TRs to rice agronomic trait variations. These findings would enhance our understanding of the functions of multiallelic variants and facilitate rice molecular breeding.

Tandem repeats have a unique ability to drive a range of phenotype variations. Here, the authors survey tandem repeat variations in rice based on a pan-genome graph, and identify tandem repeats that are associated with expressed genes and contributed to agronomic traits.
Science Bulletin,
Year: 2023, Issue: 68(20), pp. 2391-2404
Published: Aug. 15, 2023
Characterizing natural selection signatures and relationships with phenotype spectra is important for understanding human evolution and both biological and pathological mechanisms. Here, we identified 24 genetic loci under recent selection by analyzing rare singletons in 3946 high-depth whole-genome sequencing data of Han Chinese. The loci include immune-related gene regions (MHC cluster, IGH cluster, STING1, PSG), alcohol metabolism-related gene regions (ADH1B, ALDH2, ALDH3B2), and the olfactory perception gene OR4C16, of which the MHC cluster, ADH1B and ALDH2 were also identified in TOPMed and WestLake Biobank. Among the signals, the IGH cluster is particularly interesting: the favored allele of variant 14_105737776_C_T (rs117518546, IgG1-G396R) promotes immune response, but also increases the risk of an autoimmune disease, systemic lupus erythematosus (SLE). It is surprising that our newly discovered gene ALDH3B2 evolved in the opposite direction to ADH1B and ALDH2 for alcohol metabolism. Besides monogenic traits, we found that multiple complex traits experienced polygenic adaptation. Particularly, multi-methods consistently revealed that lower blood pressure was favored by natural selection. Finally, we built a database named RePoS (Recent Positive Selection, http://bigdata.ibp.ac.cn/RePoS/) to integrate and display multi-population selection signals. Our study extended our understanding of natural evolution and phenotype adaptation in Han Chinese as well as other populations.
Genome Biology and Evolution,
Year: 2024, Issue: 16(3)
Published: Feb. 27, 2024
Microsatellites are widely used in population genetics, but their evolutionary dynamics remain poorly understood. It is unclear whether microsatellite loci drift in length over time. This is important because the mutation processes that underlie these genetic markers are central to the evolutionary models that employ microsatellites. We identify more than 27 million microsatellites using a novel and unique dataset of modern and ancient Adélie penguin genomes along with data from 63 published chordate genomes. We investigate microsatellite evolutionary dynamics over 2 timescales: one based on Adélie penguin samples dating to ∼46.5 ka, the other based on the diversification of chordates aged more than 500 Ma. We show that the process of microsatellite allele evolution is at dynamic equilibrium; while there is length polymorphism among individuals, the length distribution for a given locus remains stable. Many microsatellites persist over very long timescales, particularly in exons and regulatory sequences. These often retain length variability, suggesting that they may play a role in maintaining phenotypic variation within populations.