The repertoire of short tandem repeats across the tree of life
Nikol Chantzi, Ilias Georgakopoulos-Soares

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Volume: unknown

Published: Aug. 9, 2024

Abstract: Short tandem repeats (STRs) are widespread, dynamic repetitive elements with a number of biological functions and relevance to human diseases. However, their prevalence across taxa remains poorly characterized. Here we examined the impact of STRs in the genomes of 117,253 organisms spanning the tree of life. We find that there are large differences in the frequencies of STRs between organismal genomes and that these differences are largely driven by the taxonomic group an organism belongs to. Using simulated genomes, we find that on average there is no enrichment of STRs in bacterial and archaeal genomes, suggesting that these genomes are not particularly repetitive. In contrast, we find that eukaryotic genomes are orders of magnitude more repetitive than expected. STRs are preferentially located at functional loci in specific taxa. Finally, we utilize the recently completed Telomere-to-Telomere genomes of human and other great apes, and find that STRs are highly abundant and variable between primate species, particularly in peri/centromeric regions. We conclude that STRs have expanded in eukaryotic and viral lineages and not in archaea or bacteria, resulting in large discrepancies in genomic composition.

Language: English

Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2024
Xue Bai, Yīmíng Bào, Shaoqi Bei et al.

Nucleic Acids Research, Year: 2023, Volume: 52(D1), Pages: D18-D32

Published: Nov. 29, 2023

Abstract: The National Genomics Data Center (NGDC), which is a part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support the global academic and industrial communities. With the rapid accumulation of multi-omics data at an unprecedented pace, CNCB-NGDC continuously expands and updates core database resources through big data archiving, integrative analysis and value-added curation. Importantly, NGDC collaborates closely with major international databases and initiatives to ensure seamless data exchange and interoperability. Over the past year, significant efforts have been dedicated to integrating diverse omics data, synthesizing expanding knowledge, developing new resources, and upgrading major existing resources. Particularly, several database resources are newly developed for the biodiversity of protists (P10K), bacteria (NTM-DB, MPA) as well as plant (PPGR, SoyOmics, PlantPan) and disease/trait association (CROST, HervD Atlas, HALL, MACdb, BioKA, RePoS, PGG.SV, NAFLDkb). All the resources and services are publicly accessible at https://ngdc.cncb.ac.cn.

Language: English

Cited by: 137

Repetitive DNA sequence detection and its role in the human genome
Xingyu Liao, Wufei Zhu, Juexiao Zhou et al.

Communications Biology, Year: 2023, Volume: 6(1)

Published: Sep. 19, 2023

Abstract: Repetitive DNA sequences play critical roles in driving evolution, inducing variation, and regulating gene expression. In this review, we summarized the definition, arrangement, and structural characteristics of repeats. Besides, we introduced the diverse biological functions of repeats and reviewed existing methods for automatic repeat detection, classification, and masking. Finally, we analyzed the type, structure, and regulation of repeats in the human genome and their role in the induction of complex diseases. We believe that this review will facilitate a comprehensive understanding of repeats and provide guidance for repeat annotation and in-depth exploration of its association with human diseases.

Language: English

Cited by: 61

Sequencing and characterizing short tandem repeats in the human genome
Hope A. Tanudisastro, Ira W. Deveson, Harriet Dashnow et al.

Nature Reviews Genetics, Year: 2024, Volume: 25(7), Pages: 460-475

Published: Feb. 16, 2024

Language: English

Cited by: 40

A genome-wide spectrum of tandem repeat expansions in 338,963 humans
Ya Cui, Wenbin Ye, Jason Sheng Li et al.

Cell, Year: 2024, Volume: 187(9), Pages: 2336-2341.e5

Published: Apr. 1, 2024

The Genome Aggregation Database (gnomAD), widely recognized as the gold-standard reference map of human genetic variation, has largely overlooked tandem repeat (TR) expansions, despite the fact that TRs constitute ∼6% of our genome and are linked to over 50 human diseases. Here, we introduce the TR-gnomAD (https://wlcb.oit.uci.edu/TRgnomAD), a biobank-scale reference of 0.86 million TRs derived from 338,963 whole-genome sequencing (WGS) samples of diverse ancestries (39.5% non-European samples). TR-gnomAD offers critical insights into ancestry-specific disease prevalence using disparities in TR unit number frequencies among ancestries. Moreover, TR-gnomAD is able to differentiate between common, presumably benign TR expansions, which are prevalent in TR-gnomAD, and those potentially pathogenic TR expansions, which are found more frequently in disease groups than within TR-gnomAD. Together, TR-gnomAD is an invaluable resource for researchers and physicians to interpret TR expansions in individuals with genetic diseases.

Language: English

Cited by: 16

Polygenic burden of short tandem repeat expansions promotes risk for Alzheimer’s disease
Michael H. Guo, Wan‐Ping Lee, Badri N. Vardarajan et al.

Nature Communications, Year: 2025, Volume: 16(1)

Published: Jan. 28, 2025

Studies of the genetics of Alzheimer's disease (AD) have largely focused on single nucleotide variants and short insertions/deletions. However, most of the disease heritability has yet to be uncovered, suggesting that there is substantial genetic risk conferred by other forms of genetic variation. There are over one million short tandem repeats (STRs) in the genome, and their link to AD risk has not been assessed. As pathogenic expansions of STRs cause over 30 neurologic diseases, it is important to ascertain whether STRs may also be implicated in AD risk. Here, we genotype 312,731 polymorphic STR tracts genome-wide using PCR-free whole genome sequencing data from 2981 individuals (1489 AD case and 1492 control individuals). We implement an approach to identify STR expansions as STRs with tract lengths that are outliers from the population. We then test for differences in aggregate burden of STR expansions in case versus control individuals. AD patients harbor a 1.19-fold increase of STR expansions compared to healthy elderly controls (p = 8.27 × 10⁻³, two-sided Mann-Whitney test). Individuals carrying >30 STR expansions have 3.69-fold higher odds of having AD and have more severe AD neuropathology. AD STR expansions are highly enriched within active promoters in post-mortem hippocampal brain tissues and particularly within SINE-VNTR-Alu (SVA) retrotransposons. Together, these results demonstrate that expanded STRs within active promoter regions of the genome associate with risk of AD. The authors explore how tandem repeat DNA sequences affect Alzheimer's disease. They find that individuals who carry a high burden of expanded tandem repeats have more than three-fold increased risk of Alzheimer's disease.

Language: English

Cited by: 2

Diagnostic uplift through the implementation of short tandem repeat analysis using exome sequencing
Jihoon G. Yoon, Seungbok Lee, Jaeso Cho et al.

European Journal of Human Genetics, Year: 2024, Volume: 32(5), Pages: 584-587

Published: Feb. 2, 2024

Abstract: To date, approximately 50 short tandem repeat (STR) disorders have been identified; yet, clinical laboratories rarely conduct STR analysis on exomes. To assess its diagnostic value, we analyzed STRs in 6099 exomes from 2510 families with mostly suspected neurogenetic disorders. We employed ExpansionHunter and REViewer to detect pathogenic repeat expansions, confirming them using orthogonal methods. Genotype-phenotype correlations led to the diagnosis of thirteen individuals in seven previously undiagnosed families, identifying three autosomal dominant disorders: dentatorubral-pallidoluysian atrophy (n = 3), spinocerebellar ataxia type 7 (n = 2), and myotonic dystrophy type 1 (n = 2), resulting in a diagnostic gain of 0.28% (7/2510). Additionally, we found expanded ATXN1 alleles (≥39 repeats) with varying patterns of CAT interruptions in twelve individuals, accounting for 0.19% in the Korean population. Our study underscores the importance of integrating STR analysis into the exome sequencing pipeline, broadening the application of exome sequencing for STR assessments.

Language: English

Cited by: 13

Short tandem repeat mutations regulate gene expression in colorectal cancer
Max A. Verbiest, Oxana Lundström, Feifei Xia et al.

Scientific Reports, Year: 2024, Volume: 14(1)

Published: Feb. 9, 2024

Short tandem repeat (STR) mutations are prevalent in colorectal cancer (CRC), especially in tumours with the microsatellite instability (MSI) phenotype. While STR length variations are known to regulate gene expression under physiological conditions, the functional impact of STR mutations in CRC remains unclear. Here, we integrate STR mutation data with clinical information and gene expression data to study the gene regulatory effects of STR mutations in CRC. We confirm that STR mutability in CRC highly depends on the MSI status, repeat unit size, and repeat length. Furthermore, we present a set of 1244 putative expression STRs (eSTRs) for which the STR length is associated with gene expression levels in CRC tumours. The length of 73 eSTRs is associated with expression levels of cancer-related genes, nine of which are CRC-specific genes. We show that linear models describing eSTR-gene expression relationships allow for predictions of gene expression changes in response to eSTR mutations. Moreover, we found an increased mutability of eSTRs in MSI tumours. Our evidence of gene regulatory roles for eSTRs in CRC highlights a mostly overlooked way through which tumours may modulate their phenotypes. Future extensions of these findings could uncover new STR-based targets in the treatment of cancer.

Language: English

Cited by: 5

The pan-tandem repeat map highlights multiallelic variants underlying gene expression and agronomic traits in rice
Huiying He, Yue Leng, Xinglan Cao et al.

Nature Communications, Year: 2024, Volume: 15(1)

Published: Aug. 24, 2024

Tandem repeats (TRs) are genomic regions that tandemly change in repeat number, which are often multiallelic. Their characteristics and contributions to gene expression and quantitative traits in rice are largely unknown. Here, we survey rice TR variations based on 231 genome assemblies and the rice pan-genome graph. We identify 227,391 multiallelic TR loci, including 54,416 TR variations that are absent from the Nipponbare reference genome. Only 1/3 of TR variations show strong linkage with nearby bi-allelic variants (SNPs, Indels and PAVs). Using 193 panicle and 202 leaf transcriptomic data, we reveal that 485 and 511 TRs, respectively, act as QTLs independently of other bi-allelic variations to nearby gene expression. Using plant height and grain width as examples, we identify and validate TR contributions to rice agronomic trait variations. These findings would enhance our understanding of the functions of multiallelic variants and facilitate rice molecular breeding. Tandem repeats (TRs) have a unique ability to drive a range of phenotype variations. Here, the authors survey TR variations based on 231 rice genome assemblies and the rice pan-genome graph, and identify TRs associated with expressed genes and contributing to agronomic traits.

Language: English

Cited by: 5

Recent positive selection signatures reveal phenotypic evolution in the Han Chinese population
Huaxia Luo, Peng Zhang, Wanyu Zhang et al.

Science Bulletin, Year: 2023, Volume: 68(20), Pages: 2391-2404

Published: Aug. 15, 2023

Characterizing natural selection signatures and relationships with phenotype spectra is important for understanding human evolution and both biological and pathological mechanisms. Here, we identified 24 genetic loci under recent selection by analyzing rare singletons in 3946 high-depth whole-genome sequencing data of Han Chinese. The loci include immune-related gene regions (MHC cluster, IGH cluster, STING1, PSG), alcohol metabolism-related gene regions (ADH1B, ALDH2, ALDH3B2), and the olfactory perception gene OR4C16, of which the MHC cluster, ADH1B and ALDH2 were also identified by TOPMed and WestLake Biobank. Among the signals, the IGH cluster is particularly interesting, in which the favored allele of variant 14_105737776_C_T (rs117518546, IgG1-G396R) promotes immune response, but also increases the risk of an autoimmune disease, systemic lupus erythematosus (SLE). It is surprising that our newly discovered ALDH3B2 evolved in the opposite direction to ALDH2 for alcohol metabolism. Besides monogenic traits, we found that multiple complex traits experienced polygenic adaptation. Particularly, multi-methods consistently revealed that lower blood pressure was under selection. Finally, we built a database named RePoS (Recent Positive Selection, http://bigdata.ibp.ac.cn/RePoS/) to integrate and display multi-population selection signals. Our study extended our understanding of natural selection and phenotypic adaptation in Han Chinese as well as other populations.

Language: English

Cited by: 12

Ancient and Modern Genomes Reveal Microsatellites Maintain a Dynamic Equilibrium Through Deep Time
Bennet J. McComish, Michael Charleston, Matthew Parks et al.

Genome Biology and Evolution, Year: 2024, Volume: 16(3)

Published: Feb. 27, 2024

Microsatellites are widely used in population genetics, but their evolutionary dynamics remain poorly understood. It is unclear whether microsatellite loci drift in length over time. This is important because the mutation processes that underlie these important genetic markers are central to the evolutionary models that employ microsatellites. We identify more than 27 million microsatellites using a novel and unique dataset of modern and ancient Adélie penguin genomes along with data from 63 published chordate genomes. We investigate microsatellite evolutionary dynamics over 2 timescales: one based on Adélie penguin samples dating to ∼46.5 ka, the other dating to the diversification of chordates aged more than 500 Ma. We show that the process of microsatellite allele length evolution is at dynamic equilibrium; while there is length polymorphism among individuals, the length distribution for a given locus remains stable. Many microsatellites persist over very long timescales, particularly in exons and regulatory sequences. These often retain length variability, suggesting that they may play a role in maintaining phenotypic variation within populations.

Language: English

Cited by: 4