A personalized multi-platform assessment of somatic mosaicism in the human frontal cortex (Creative Commons)
Weichen Zhou, Camille Mumm, Yanming Gan, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 21, 2024

Somatic mutations in individual cells lead to genomic mosaicism, contributing to the intricate regulatory landscape of genetic disorders and cancers. To evaluate and refine the detection of somatic mosaicism across different technologies with a personalized donor-specific assembly (DSA), we obtained tissue from the dorsolateral prefrontal cortex (DLPFC) of a post-mortem neurotypical 31-year-old individual. We sequenced bulk DLPFC tissue using Oxford Nanopore Technologies (~60X), NovaSeq (~30X), and linked-read sequencing (~28X). Additionally, we applied a Cas9 capture methodology coupled with long-read sequencing (TEnCATS), targeting active transposable elements. We also isolated and amplified DNA from flow-sorted single DLPFC neurons using MALBAC, sequencing 115 of these MALBAC libraries on Nanopore and 94 on NovaSeq. We constructed a haplotype-resolved assembly with a total length of 5.77 Gb and a phase block length of 2.67 Mb (N50) to facilitate cross-platform analysis of somatic variations. We observed an increase in the phasing rate from 11.6% to 38.0% between short-read and long-read technologies. By generating a catalog of phased germline SNVs, CNVs, and TEs from the assembled genome, we applied standard approaches to recall these variants and achieved aggregated recall rates from 97.3% to 99.4% based on the data, setting an upper bound for detection limits. Moreover, utilizing haplotype-based analysis from the DSA, we achieved a remarkable reduction in false positive somatic calls in bulk tissue, ranging from 14.9% to 72.4%. We developed pipelines leveraging DSA information to enhance large somatic variant calling in single cells. By examining somatic variation using long-reads in single neurons, we identified 468 candidate somatic heterozygous deletions (1.5 Mb - 20 Mb), 137 of which intersected with short-read single-cell data. We also identified 61 putative somatic TEs (60 Alus, one LINE-1) in the single neurons. Collectively, our analysis spans from personalized assembly to single-cell somatic variant calling, providing a comprehensive ab initio ad finem approach and resource in real human tissue.

Language: English

A Hitchhiker's Guide to long-read genomic analysis
Medhat Mahmoud, Daniel Paiva Agustinho, Fritz J. Sedlazeck, et al.

Genome Research, Journal Year: 2025, Volume and Issue: 35(4), P. 545 - 558

Published: April 1, 2025

Over the past decade, long-read sequencing has evolved into a pivotal technology for uncovering the hidden and complex regions of the genome. Significant cost efficiency, scalability, and accuracy advancements have driven this evolution. Concurrently, novel analytical methods have emerged to harness the full potential of long reads. These advancements have enabled milestones such as the first fully completed human genome, enhanced identification and understanding of complex genomic variants, and deeper insights into the interplay between epigenetics and genomic variation. This mini-review provides a comprehensive overview of the latest developments in long-read DNA sequencing analysis, encompassing reference-based and de novo assembly approaches. We explore the entire workflow, from initial data processing to variant calling and annotation, focusing on how these methods improve our ability to interpret a wide array of genomic variants. Additionally, we discuss the current challenges, limitations, and future directions in the field, offering a detailed examination of the state-of-the-art bioinformatics methods for long-read sequencing.

Language: English

Citations: 1

TopoQual polishes circular consensus sequencing data and accurately predicts quality scores (Creative Commons)
Minindu Weerakoon, Sangjin Lee, Emily Mitchell, et al.

BMC Bioinformatics, Journal Year: 2025, Volume and Issue: 26(1)

Published: Jan. 16, 2025

Background: Pacific Biosciences (PacBio) circular consensus sequencing (CCS), also known as high fidelity (HiFi) technology, has revolutionized modern genomics by producing long (10+ kb) and highly accurate reads. This is achieved by sequencing circularized DNA molecules multiple times and combining them into a consensus sequence. Currently, the accuracy and quality value estimation provided by HiFi technology are more than sufficient for applications such as genome assembly and germline variant calling. However, there are limitations in the accuracy of the estimated quality scores when it comes to somatic variant calling on single reads. Results: To address the challenge of inaccurate quality scores for somatic variant calling, we introduce TopoQual, a novel tool designed to enhance the accuracy of base quality predictions. TopoQual leverages techniques including partial order alignments (POA), topologically parallel bases, and deep learning algorithms to polish consensus sequences. Our results demonstrate that TopoQual corrects approximately 31.9% of errors in PacBio consensus sequences. Additionally, it validates base qualities up to q59, which corresponds to one error in 0.9 million bases. These improvements will significantly enhance the reliability of somatic variant calling using HiFi data. Conclusion: TopoQual represents a significant advancement in genomics by improving the accuracy of base quality predictions for PacBio HiFi sequencing data. By correcting a substantial proportion of errors and achieving high base quality validation, TopoQual enables confident and accurate somatic variant calling. This tool not only addresses a critical limitation of current HiFi technology but also opens new possibilities for precise genomic analysis in various research and clinical applications.

Language: English

Citations: 0

Severus detects somatic structural variation and complex rearrangements in cancer genomes using long-read sequencing
Ayşe Gökçe Keşküş, Asher Bryant, Tanveer Ahmad, et al.

Nature Biotechnology, Journal Year: 2025, Volume and Issue: unknown

Published: April 4, 2025

Language: English

Citations: 0
