Advancing genomic technologies and clinical awareness accelerates discovery of disease-associated tandem repeat sequences
Terence Gall-Duncan, Nozomu Sato, Ryan K. C. Yuen

et al.

Genome Research, Journal Year: 2021, Volume and Issue: 32(1), P. 1 - 27

Published: Dec. 29, 2021

Expansions of gene-specific DNA tandem repeats (TRs), first described in 1991 as a disease-causing mutation in humans, are now known to cause >60 phenotypes, not just disease, and not only in humans. TRs are a common form of genetic variation with biological consequences, observed, so far, in humans, dogs, plants, oysters, and yeast. Repeat diseases show atypical clinical features, genetic anticipation, and multiple and partially penetrant phenotypes among family members. Discovery of disease-causing repeat expansion loci accelerated through technological advances in DNA sequencing and computational analyses. Between 2019 and 2021, 17 new disease-causing TR expansions were reported, totaling 63 TR loci (>69 diseases), with a likelihood of more discoveries, and in more organisms. Recent and historical lessons reveal that properly assessed clinical presentations, coupled with genetic and biological awareness, can guide discovery of disease-causing unstable TRs. We highlight critical but underrecognized aspects of TR mutations. Repeat motifs may not be present in current reference genomes but will be in forthcoming gapless long-read references. Repeat motif size can range from a single nucleotide to kilobases/unit. At a given locus, repeat motif sequence purity can vary with consequence. Pathogenic repeats can be “insertions” within nonpathogenic TRs. Expansions, contractions, and somatic length variations of TRs can have clinical/biological consequences. TR instabilities occur in humans and other organisms. TR expansions can be epigenetically modified and/or chromosomal fragile sites. We discuss the expanding field of disease-associated TR instabilities, highlighting prospects, clues, tools, and challenges for further disease-causing TR discoveries and for understanding their pathological impacts, a vista that is about to expand.

Language: English

The complete sequence of a human genome
Sergey Nurk, Sergey Koren, Arang Rhie

et al.

Science, Journal Year: 2022, Volume and Issue: 376(6588), P. 44 - 53

Published: March 31, 2022

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion–base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.

Language: English

Citations

2151

Complete genomic and epigenetic maps of human centromeres
Nicolas Altemose, Glennis A. Logsdon, Andrey V. Bzikadze

et al.

Science, Journal Year: 2022, Volume and Issue: 376(6588)

Published: March 31, 2022

Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.

Language: English

Citations

374

A complete reference genome improves analysis of human genetic variation
Sergey Aganezov, Stephanie M. Yan, Daniela C. Soto

et al.

Science, Journal Year: 2022, Volume and Issue: 376(6588)

Published: March 31, 2022

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary and biomedical discovery. Simultaneously, this reference eliminates tens of thousands of spurious variants per sample, including a reduction of false positives in 269 medically relevant genes by up to a factor of 12. Because of these improvements in variant discovery, coupled with population and functional genomic resources, T2T-CHM13 is positioned to replace GRCh38 as the prevailing reference for human genetics.

Language: English

Citations

275

From telomere to telomere: The transcriptional and epigenetic state of human repeat elements
Savannah J. Hoyt, Jessica M. Storer, Gabrielle A. Hartley

et al.

Science, Journal Year: 2022, Volume and Issue: 376(6588)

Published: March 31, 2022

Mobile elements and repetitive genomic regions are sources of lineage-specific genomic innovation and uniquely fingerprint individual genomes. Comprehensive analyses of such repeat elements, including those found in more complex regions of the genome, require a complete, linear genome assembly. We present a de novo repeat discovery and annotation of the T2T-CHM13 human reference genome. We identified previously unknown satellite arrays, expanded the catalog of variants and families for repeats and mobile elements, characterized classes of complex composite repeats, and located retroelement transduction events. We detected nascent transcription and delineated CpG methylation profiles to define the structure of transcriptionally active retroelements in humans, including those in centromeres. These data expand our insight into the diversity, distribution, and evolution of repetitive regions that have shaped the human genome.

Language: English

Citations

258

The complete sequence of a human Y chromosome
Arang Rhie, Sergey Nurk, Monika Čechová

et al.

Nature, Journal Year: 2023, Volume and Issue: 621(7978), P. 344 - 354

Published: Aug. 23, 2023

Language: English

Citations

234

Epigenetic patterns in a complete human genome
Ariel Gershman, Michael Sauria, Xavi Guitart

et al.

Science, Journal Year: 2022, Volume and Issue: 376(6588)

Published: March 31, 2022

The completion of a telomere-to-telomere human reference genome, T2T-CHM13, has resolved complex regions of the genome, including repetitive and homologous regions. Here, we present a high-resolution epigenetic study of previously unresolved sequences, representing entire acrocentric chromosome short arms, gene family expansions, and a diverse collection of repeat classes. This resource precisely maps CpG methylation (32.28 million CpGs), DNA accessibility, and short-read datasets (166,058 previously unresolved chromatin immunoprecipitation sequencing peaks) to provide evidence of activity across previously unidentified or corrected genes, and reveals clinically relevant paralog-specific regulation. Probing CpG methylation across human centromeres from six diverse individuals generated an estimate of variability in kinetochore localization. This analysis provides a framework with which to investigate the most elusive regions of the human genome, granting insights into epigenetic regulation.

Language: English

Citations

214

Taming transposable elements in vertebrates: from epigenetic silencing to domestication
Miguel Vasconcelos Almeida, Grégoire Vernaz, Audrey L. K. Putman

et al.

Trends in Genetics, Journal Year: 2022, Volume and Issue: 38(6), P. 529 - 553

Published: March 17, 2022

Language: English

Citations

97

Recombination between heterologous human acrocentric chromosomes
Andrea Guarracino, Silvia Buonaiuto, Leonardo Gomes de Lima

et al.

Nature, Journal Year: 2023, Volume and Issue: 617(7960), P. 335 - 343

Published: May 10, 2023

The short arms of the human acrocentric chromosomes 13, 14, 15, 21 and 22 (SAACs) share large homologous regions, including ribosomal DNA repeats and extended segmental duplications [1,2]. Although the resolution of these regions in the first complete assembly of a human genome, the Telomere-to-Telomere Consortium’s CHM13 assembly (T2T-CHM13), provided a model of their homology [3], it remained unclear whether these patterns were ancestral or maintained by ongoing recombination exchange. Here we show that acrocentric chromosomes contain pseudo-homologous regions (PHRs) indicative of recombination between non-homologous sequences. Utilizing an all-to-all comparison of the human pangenome from the Human Pangenome Reference Consortium [4] (HPRC), we find that contigs from all of the SAACs form a community. A variation graph [5] constructed from centromere-spanning acrocentric contigs indicates the presence of regions in which most contigs appear nearly identical between heterologous acrocentric chromosomes in T2T-CHM13. Except on chromosome 15, we observe faster decay of linkage disequilibrium in the pseudo-homologous regions than in the corresponding short and long arms, indicating higher rates of recombination [6,7]. The pseudo-homologous regions include sequences that have previously been shown to lie at the breakpoint of Robertsonian translocations [8], and their arrangement is compatible with crossover in inverted duplications on chromosomes 13, 14 and 21. The ubiquity of signals of recombination between heterologous acrocentric chromosomes seen in the HPRC draft pangenome suggests that these shared sequences form the basis for recurrent Robertsonian translocations, providing sequence and population-based confirmation of hypotheses first developed from cytogenetic studies 50 years ago [9].

Language: English

Citations

86

Genomics in the long-read sequencing era
Erwin L. van Dijk, Delphine Naquin, Kévin Gorrichon

et al.

Trends in Genetics, Journal Year: 2023, Volume and Issue: 39(9), P. 649 - 671

Published: May 23, 2023

Language: English

Citations

73

Long-Read DNA Sequencing: Recent Advances and Remaining Challenges
Peter E. Warburton, Robert Sebra

Annual Review of Genomics and Human Genetics, Journal Year: 2023, Volume and Issue: 24(1), P. 109 - 132

Published: April 19, 2023

DNA sequencing has revolutionized medicine over recent decades. However, analysis of large structural variation and repetitive DNA, a hallmark of human genomes, has been limited by short-read technology, with read lengths of 100-300 bp. Long-read sequencing (LRS) permits routine sequencing of human DNA fragments tens to hundreds of kilobase pairs in size, using both real-time sequencing by synthesis and nanopore-based direct electronic sequencing. LRS permits analysis of large structural variation and haplotypic phasing in human genomes and has enabled the discovery and characterization of rare pathogenic structural variants and repeat expansions. It has also recently enabled the assembly of a complete, gapless human genome that includes previously intractable regions, such as highly repetitive centromeres and homologous acrocentric short arms. With the addition of protocols for targeted enrichment, direct epigenetic DNA modification detection, and long-range chromatin profiling, LRS promises to launch a new era of understanding of genetic diversity and pathogenic mutations in human populations.

Language: English

Citations

67