bioRxiv (Cold Spring Harbor Laboratory),
Journal year: 2024,
Issue: unknown
Published: Oct. 5, 2024
Abstract
Tandem repeat (TR) catalogs are important components of genotyping studies as they define the genomic coordinates and expected motifs of all TR loci being analyzed. In recent years, genome-wide TR genotyping studies have used catalogs ranging in size from fewer than 200,000 to over 7 million loci. Where these catalogs overlapped, they often disagreed on locus boundaries, hindering the comparison and reuse of results across studies. Now, with multiple groups developing public databases of TR variation in large population cohorts, there is a risk that, without sufficient consensus in the choice of locus definitions, the use of divergent TR catalogs will lead to confusion, fragmentation, and incompatibility across future resources. In this paper, we compare existing TR catalogs and discuss desirable features of a comprehensive genome-wide TR catalog. We then present a new, richly annotated catalog designed for large-scale analyses and population databases. Our catalog stratifies TRs into two groups: 1) isolated TRs suitable for repeat copy number analysis using short read or long read data and 2) so-called variation clusters that contain TRs within wider polymorphic regions that are best studied through sequence-level analysis. To define variation clusters, we present a novel algorithm that leverages long-read HiFi sequencing data to group repeats with surrounding polymorphisms. We show that the human genome contains at least 25,000 complex variation clusters, most of which span over 120 bp and contain five or more TRs. Resolving the sequence of entire variation clusters instead of individually genotyping constituent TRs leads to a more accurate analysis of these regions and enables us to profile variation that would have been missed otherwise.
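The boundary disagreement described in this abstract can be illustrated with a minimal sketch. It is not the authors' catalog comparison or their variation-cluster algorithm. The `Locus` record, the half-open coordinate convention, and the toy catalogs are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Locus(NamedTuple):
    """Hypothetical TR catalog entry (0-based start, exclusive end)."""
    chrom: str
    start: int
    end: int
    motif: str


def overlaps(a: Locus, b: Locus) -> bool:
    """True when two loci share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def boundary_disagreements(
    catalog_a: List[Locus], catalog_b: List[Locus]
) -> List[Tuple[Locus, Locus]]:
    """Pairs of overlapping loci whose coordinates are not identical.

    Brute-force all-against-all comparison; fine for a toy example,
    too slow for catalogs with millions of loci.
    """
    pairs = []
    for a in catalog_a:
        for b in catalog_b:
            if overlaps(a, b) and (a.start, a.end) != (b.start, b.end):
                pairs.append((a, b))
    return pairs


# Toy catalogs (made-up coordinates, not taken from any real catalog).
catalog_a = [Locus("chr1", 100, 130, "CAG"), Locus("chr1", 500, 520, "AT")]
catalog_b = [
    Locus("chr1", 95, 130, "CAG"),   # same repeat, different start
    Locus("chr1", 500, 520, "AT"),   # identical definition
    Locus("chr2", 100, 130, "CAG"),  # different chromosome, no overlap
]
disagreements = boundary_disagreements(catalog_a, catalog_b)
```

Only the first pair is reported here, because the two catalogs place the start of the same repeat at different positions. Genotypes for such a locus would not be directly comparable between studies without reconciling the definitions.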
medRxiv (Cold Spring Harbor Laboratory),
Journal year: 2024,
Issue: unknown
Published: March 7, 2024
Less than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
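The two run-level statistics quoted in this abstract, read N50 and average depth of coverage, follow standard definitions that can be sketched briefly. This is an illustrative sketch only, not the consortium's pipeline. The function names and the toy read lengths are assumptions.

```python
from typing import Sequence


def read_n50(lengths: Sequence[int]) -> int:
    """Length L such that reads of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("at least one read length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def mean_depth(lengths: Sequence[int], genome_size: int) -> float:
    """Average depth of coverage: total sequenced bases over genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(lengths) / genome_size


# Toy read lengths (made up): 54 bases in total, half of them in reads >= 8.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = read_n50(toy_lengths)
toy_depth = mean_depth(toy_lengths, genome_size=27)
```

In the toy input the three longest reads (10, 9 and 8) hold 27 of the 54 bases, so the N50 is 8. Dividing the total by the assumed genome size of 27 gives a depth of 2.0.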
Journal of Genetics and Genomics,
Journal year: 2024,
Issue: unknown
Published: July 1, 2024
Genetic genealogy provides crucial insights into the complex biological relationships within contemporary and ancient human populations by analyzing shared alleles and chromosomal segments that are identical by descent, to understand kinship, migration patterns, and population dynamics. Within forensic science, forensic investigative genetic genealogy (FIGG) has gained prominence by leveraging next-generation sequencing technologies and population-specific genomic resources, opening new investigative avenues. In this review, we synthesize current knowledge, underscore recent advancements, and discuss the growing role of FIGG in forensic genomics. FIGG has been pivotal in revitalizing dormant inquiries and offering new leads in numerous cold cases. Its effectiveness relies on the extensive SNP profiles contributed by individuals from diverse populations to specialized genomic databases. Advances in computational genomics and the growth of human genomic databases have spurred a profound shift in the application of genetic genealogy across forensics, anthropology, and ancient DNA studies. As the field progresses, FIGG is evolving from a nascent practice into a more sophisticated and specialized discipline, shaping the future of forensic investigations.
Abstract
Oxford Nanopore Technology (ONT) sequencing is a third-generation sequencing technology that enables cost-effective long-read sequencing, with broad applications in biological research. However, its high sequencing error rate in low-complexity regions hampers its applications in short tandem repeat (STR)-related research. To address this, we generated a comprehensive STR error profile of ONT by analyzing publicly available ONT sequencing datasets. We show that the sequencing error rate is influenced not only by STR length but also by the repeat unit and the flanking sequences of STR regions. Interestingly, certain flanking sequences were associated with higher sequencing accuracy, suggesting that certain STR loci are more suitable for ONT sequencing compared to other STR loci. While base quality scores of substitution errors within STR regions were lower than those of correctly sequenced bases, such patterns were not observed for indel errors. Furthermore, choosing the most recent basecaller version and using the super accuracy model significantly improved STR sequencing accuracy. Finally, we present NanoMnT, a lightweight Python tool that corrects STR sequencing errors in ONT data and estimates STR allele sizes. NanoMnT leverages the characteristics of ONT sequencing errors when estimating STR allele size and exhibits superior results for 1-bp- and 2-bp repeat STRs compared to existing tools. By integrating our findings, we improved STR allele size estimation accuracy for Ax10 repeats from 55% to 78%, and up to 85% when excluding loci with unfavorable flanking sequences. Using NanoMnT, we present the utility of our findings by identifying microsatellite instability status in cancer sequencing data. NanoMnT is publicly available at https://github.com/18parkky/NanoMnT.
Revue Neurologique,
Journal year: 2024,
Issue: 180(5), pp. 383-392
Published: April 8, 2024
Tandem repeats are a common, highly polymorphic class of variation in human genomes. Their expansion beyond a pathogenic threshold is a process that contributes to a wide range of neurological and neuromuscular genetic disorders, of which over 60 have been identified to date. The last few years have seen a resurgence of repeat expansion discovery propelled by technological advancements, enabling the identification of over 20 novel repeat expansion disorders. These expansions can occur in coding or non-coding regions of genes, resulting in a range of pathogenic mechanisms. In this article, we review strategies, tools and methods that can be used for the efficient detection and characterization of known and new repeat expansions. Features that can be used to prioritize repeat expansions include anticipation, characterized by increased severity and earlier onset of symptoms across generations, and founder effects, which contribute to higher prevalence rates in certain populations. Classical technologies such as Southern blotting, repeat-primed polymerase chain reaction (PCR) and long-range PCR can still be used to detect expansions, although they usually have significant limitations linked to the absence of sequence context. Targeted sequencing using either CRISPR-Cas9 enrichment combined with long-read sequencing or adaptive nanopore sampling are better but more expensive alternatives. The development of bioinformatics tools applied to short-read genome data can now detect expansions in a targeted manner or at the genome-wide level. In addition, technological advances, particularly optical genome mapping (Bionano Genomics), Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) HiFi sequencing, offer promising avenues for the detection of repeat expansions. Despite challenges in specific DNA extraction requirements, computation resources needed and data interpretation, these technologies have an immense potential to advance our understanding of repeat expansion disorders and improve diagnostic accuracy.
Human Genome Variation,
Journal year: 2024,
Issue: 11(1)
Published: April 17, 2024
Abstract
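Short- and long-read sequencing technologies are routinely used to detect DNA variants, including SNVs, indels, and structural variations (SVs). However, the differences in the quality and quantity of variants detected between short- and long-read data are not fully understood. In this study, we comprehensively evaluated the variant calling performance of short- and long-read-based SNV, indel, and SV detection algorithms (6 for SNVs, 12 for indels, and 13 for SVs) using a novel evaluation framework incorporating manual visual inspection. The results showed that indel-insertion calls greater than 10 bp were poorly detected by short-read-based detection algorithms compared to long-read-based algorithms; however, the recall and precision of SNV and indel-deletion detection were similar between short- and long-read data. The recall of SV detection with short-read-based algorithms was significantly lower in repetitive regions, especially for small- to intermediate-sized SVs, than that detected with long-read-based algorithms. In contrast, the recall and precision of SV detection in nonrepetitive regions were similar between short- and long-read data. These findings suggest the need for refined strategies, such as incorporating multiple variant detection algorithms, to generate a more complete set of variants using short-read data.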
Clinical Chemistry and Laboratory Medicine (CCLM),
Journal year: 2025,
Issue: unknown
Published: March 19, 2025
Abstract
Objectives
Many patients with sex chromosome abnormalities (SCAs) are diagnosed late in life or remain undiagnosed, leading to delayed or inadequate medical intervention and care. This study aimed to develop a reliable, rapid and cost-effective test for identifying SCAs using a blood sample, an essential step toward establishing a neonatal screening program.
Methods
A total of 360 samples (180 SCA patients, 180 controls) were obtained from four cross-sectional studies of adult SCA patients and age-matched controls. Informed consent was collected, and all procedures followed the Declaration of Helsinki. Multiplex quantitative fluorescence polymerase chain reaction (QF-PCR) utilizing short tandem repeat (STR) and X-linked segmental duplication (SD) markers was performed. Results were analyzed using an automated algorithm. Deviant results were manually reviewed to differentiate errors in the PCR process from those in data analysis.
Results
Following automated analysis of QF-PCR results, the method accurately identified 174 SCA patients (sensitivity: 96.7 %) and 171 controls (specificity: 95.0 %). Mosaic karyotypes were particularly challenging to diagnose. Manual reanalysis corrected all false positives, achieving 100 % specificity.
Conclusions
QF-PCR is a promising and reliable method for the detection of SCAs in blood samples, offering cost-effectiveness and scalability. The specificity following automated analysis was not satisfactory. The underlying technique, however, demonstrated 100 % specificity, indicating that refining the algorithm would significantly reduce false positive results. With further refinements, we believe this method would be highly suitable for evaluation in a newborn screening setting.
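The sensitivity and specificity reported in this abstract can be reproduced from the stated counts. The sketch below uses the standard definitions. The function names are illustrative. The false-negative and false-positive counts are not stated in the abstract; they are derived here by subtracting the correctly identified samples from the 180 patients and 180 controls.

```python
def sensitivity_pct(true_pos: int, false_neg: int) -> float:
    """Percentage of affected samples that the test flags as affected."""
    return 100.0 * true_pos / (true_pos + false_neg)


def specificity_pct(true_neg: int, false_pos: int) -> float:
    """Percentage of control samples that the test reports as unaffected."""
    return 100.0 * true_neg / (true_neg + false_pos)


# Counts from the abstract: 174 of 180 SCA patients and 171 of 180 controls
# were classified correctly by the automated analysis.
patients, controls = 180, 180
true_pos, true_neg = 174, 171
false_neg = patients - true_pos   # derived: 6 missed patients
false_pos = controls - true_neg   # derived: 9 false alarms

sens = round(sensitivity_pct(true_pos, false_neg), 1)
spec = round(specificity_pct(true_neg, false_pos), 1)

# After manual reanalysis the abstract reports no remaining false positives.
spec_after_review = round(specificity_pct(controls, 0), 1)
```

With these inputs `sens` is 96.7 and `spec` is 95.0, matching the reported figures, and `spec_after_review` is 100.0.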