The repertoire of short tandem repeats across the tree of life
Nikol Chantzi, Ilias Georgakopoulos-Soares

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, No.: unknown

Published: Aug. 9, 2024

Short tandem repeats (STRs) are widespread, dynamic repetitive elements with a number of biological functions and relevance to human diseases. However, their prevalence across taxa remains poorly characterized. Here we examined the impact of STRs in the genomes of 117,253 organisms spanning the tree of life. We find that there are large differences in the frequencies of STRs between organismal genomes and that these differences are largely driven by the taxonomic group an organism belongs to. Using simulated genomes, we find that on average there is no enrichment of STRs in bacterial and archaeal genomes, suggesting that these genomes are not particularly repetitive. In contrast, we find that eukaryotic genomes are orders of magnitude more repetitive than expected. STRs are preferentially located at functional loci in specific taxa. Finally, we utilize the recently completed Telomere-to-Telomere genomes of human and other great apes, and find that STRs are highly abundant and variable between primate species, particularly in peri/centromeric regions. We conclude that STRs have expanded in eukaryotic and viral lineages and not in archaea or bacteria, resulting in large discrepancies in genomic composition.

Language: English

Insights from a genome-wide truth set of tandem repeat variation
Ben Weisburd, Grace Tiao, Heidi L. Rehm

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, No.: unknown

Published: May 8, 2023

Tools for genotyping tandem repeats (TRs) from short read sequencing data have improved significantly over the past decade. Extensive comparisons of these tools to gold standard diagnostic methods like RP-PCR have confirmed their accuracy for tens to hundreds of well-studied loci. However, a scarcity of high-quality orthogonal truth data has limited our ability to measure tool accuracy for the millions of other loci throughout the genome. To address this, we developed a TR truth set based on the Synthetic Diploid Benchmark (SynDip). By identifying the subset of insertions and deletions that represent TR expansions or contractions with motifs between 2 and 50 base pairs, we obtained accurate genotypes for 139,795 pure and 6,845 interrupted repeats in a single diploid sample. Our approach did not require running existing genotyping tools on short read or long read sequencing data and provided an alternative, more accurate view of tandem repeat variation. We applied this truth set to compare the strengths and weaknesses of widely-used tools for genotyping TRs, evaluated the completeness of existing genome-wide TR catalogs, and explored the properties of tandem repeat variation throughout the genome. We found that, without filtering, ExpansionHunter had higher accuracy than GangSTR and HipSTR over a wide range of motifs and allele sizes. Also, when errors in allele size occurred, ExpansionHunter tended to overestimate expansion sizes, while GangSTR and HipSTR tended to underestimate them. Additionally, we saw that widely-used TR catalogs miss between 16% and 41% of variant loci in the truth set. These results suggest that genome-wide analyses would benefit from genotyping a larger set of loci as well as from further tool development that builds on the strengths of current algorithms. To this end, we developed a new catalog of 2.8 million loci that captures 95% of variant loci in our truth set, and created a modified version of ExpansionHunter that runs 3x faster than the original while producing the same output.

Language: English

Cited by

9

Comprehensive landscape of non-CODIS STRs in global populations provides new insights into challenging DNA profiles
Yuguo Huang, Mengge Wang, Chao Liu

et al.

Forensic Science International Genetics, Year: 2024, No.: 70, p. 103010

Published: Jan. 21, 2024

Language: English

Cited by

3

STRchive: a dynamic resource detailing population-level and locus-specific insights at tandem repeat disease loci
Laurel Hiatt, Ben Weisburd, Egor Dolzhenko

et al.

medRxiv (Cold Spring Harbor Laboratory), Year: 2024, No.: unknown

Published: May 21, 2024

Approximately 3% of the human genome consists of repetitive elements called tandem repeats (TRs), which include short tandem repeats (STRs) of 1–6bp motifs and variable number tandem repeats (VNTRs) of 7+bp motifs. TR variants contribute to several dozen mono- and polygenic diseases but remain understudied and “enigmatic,” particularly relative to single nucleotide variants. It remains comparatively challenging to interpret the clinical significance of TR variants. Although existing resources provide portions of the necessary data for interpretation at disease-associated loci, it is currently difficult or impossible to efficiently invoke the additional details critical to proper interpretation, such as motif pathogenicity, disease penetrance, and age of onset distributions. It is also often unclear how to apply population information to analyses. We present STRchive (S-T-archive, http://strchive.org/), a dynamic resource consolidating information on TR disease loci in humans from research literature, up-to-date clinical resources, and large-scale genomic databases, with the goal of streamlining TR variant interpretation at disease-associated loci. We apply STRchive (including pathogenic thresholds, motif classification, and clinical phenotypes) to a gnomAD cohort of ∼18.5k individuals genotyped at 60 disease-associated loci. Through detailed literature curation, we demonstrate that the majority of TR diseases affect children despite being thought of as adult diseases. Additionally, we show that pathogenic genotypes can be found within gnomAD which do not necessarily overlap with known disease prevalence, and leverage STRchive to interpret locus-specific findings therein. We apply a diagnostic blueprint empowered by STRchive to relevant clinical vignettes, highlighting possible pitfalls in TR variant interpretation. As a living resource, STRchive is maintained by experts, takes community contributions, and will evolve as understanding of TR diseases progresses.

Language: English

Cited by

3

High-Resolution NMR Structures of Intrastrand Hairpins Formed by CTG Trinucleotide Repeats
Liqi Wan, Axin He, Jinxing Li

et al.

ACS Chemical Neuroscience, Year: 2024, No.: 15(4), pp. 868-876

Published: Feb. 6, 2024

The CAG and CTG trinucleotide repeat expansions cause more than 10 human neurodegenerative diseases. Intrastrand hairpins formed by trinucleotide repeats contribute to repeat expansions, establishing them as potential drug targets. High-resolution structural determination of CAG and CTG hairpins poses a long-standing goal to aid drug development, yet it has not been realized due to the intrinsic conformational flexibility of repetitive sequences. We herein investigate the solution structures of CTG hairpins using nuclear magnetic resonance (NMR) spectroscopy and found that four CTG repeats with a clamping G-C base pair were able to form a stable hairpin structure. We determine the first NMR solution structure of dG(CTG)

Language: English

Cited by

2

The repertoire of short tandem repeats across the tree of life
Nikol Chantzi, Ilias Georgakopoulos-Soares

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, No.: unknown

Published: Aug. 9, 2024

Language: English

Cited by

2