SQANTI-SIM: a simulator of controlled transcript novelty for lrRNA-seq benchmark
Jorge Mestre-Tomás, Tianyuan Liu, Francisco J. Pardo-Palacios, et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Issue: unknown

Published: Aug. 24, 2023

Long-read RNA-seq has emerged as a powerful tool for transcript discovery, even in well-annotated organisms. However, assessing the accuracy of different methods in identifying annotated and novel transcripts remains a challenge. Here, we present SQANTI-SIM, a versatile utility that wraps around popular long-read simulators to allow precise management of transcript novelty based on the structural categories defined by SQANTI3. By selectively excluding specific transcripts from the reference dataset, SQANTI-SIM effectively emulates scenarios involving unannotated transcripts. Furthermore, SQANTI-SIM provides customizable features and supports the simulation of additional types of data, representing the first multi-omics simulation tool for the lrRNA-seq field. We demonstrate the effectiveness of SQANTI-SIM by benchmarking five transcriptome reconstruction pipelines using the simulated data.

Language: English

SQANTI3: curation of long-read transcriptomes for accurate identification of known and novel isoforms
Francisco J. Pardo-Palacios, Ángeles Arzalluz-Luque, Liudmyla Kondratova, et al.

Nature Methods, Year: 2024, Issue: 21(5), pp. 793-797

Published: Mar. 20, 2024

SQANTI3 is a tool designed for the quality control, curation and annotation of long-read transcript models obtained with third-generation sequencing technologies. Leveraging its annotation framework, SQANTI3 calculates quality descriptors of transcript models, junctions and transcript ends. With this information, potential artifacts can be identified and replaced with reliable sequences. Furthermore, the integrated functional annotation feature enables subsequent functional iso-transcriptomics analyses.

Language: English

Cited by: 50

The Third-Generation Sequencing Challenge: Novel Insights for the Omic Sciences
Carmela Scarano, Iolanda Veneruso, Rosa Redenta De Simone, et al.

Biomolecules, Year: 2024, Issue: 14(5), p. 568

Published: May 10, 2024

The understanding of the human genome has been greatly improved by the advent of next-generation sequencing technologies (NGS). Despite the undeniable advantages responsible for their widespread diffusion, these methods have some constraints, mainly related to short read length and the need for PCR amplification. As a consequence, long-read sequencers, called third-generation sequencing (TGS), have been developed, promising to overcome the constraints of NGS. Starting from the first prototype, TGS has progressively ameliorated its chemistries by improving both read length and base-calling accuracy, as well as simultaneously reducing the costs/base. Based on these premises, TGS is showing its potential in many fields, including the analysis of difficult-to-sequence genomic regions, structural variations detection, RNA expression profiling, DNA methylation study, and metagenomic analyses. Protocol standardization and the development of easy-to-use pipelines for data analysis will enhance TGS use, also opening the way for routine applications in diagnostic contexts.

Language: English

Cited by: 13

Mapping medically relevant RNA isoform diversity in the aged human frontal cortex with deep long-read RNA-seq
Bernardo Aguzzoli Heberle, J. Anthony Brandon, Madeline L. Page, et al.

Nature Biotechnology, Year: 2024, Issue: unknown

Published: May 22, 2024

Determining whether the RNA isoforms from medically relevant genes have distinct functions could facilitate direct targeting of RNA isoforms for disease treatment. Here, as a step toward this goal for neurological diseases, we sequenced 12 postmortem, aged human frontal cortices (6 Alzheimer disease cases and 6 controls; 50% female) using one Oxford Nanopore PromethION flow cell per sample. We identified 1,917 medically relevant genes expressing multiple isoforms in the frontal cortex, where 1,018 had multiple isoforms with different protein-coding sequences. Of these genes, 57 are implicated in brain-related diseases including major depression, schizophrenia, and Parkinson’s disease. Our study also uncovered 53 new RNA isoforms in medically relevant genes, including several where the new isoform was the most highly expressed for that gene. We also reported on five mitochondrially encoded, spliced RNA isoforms. We found 99 RNA isoforms differentially expressed between cases and controls.

Language: English

Cited by: 13

Comprehensive assessment of mRNA isoform detection methods for long-read sequencing data
Yaqi Su, Zhejian Yu, Siqian Jin, et al.

Nature Communications, Year: 2024, Issue: 15(1)

Published: May 10, 2024

The advancement of Long-Read Sequencing (LRS) techniques has significantly increased the length of sequencing to several kilobases, thereby facilitating the identification of alternative splicing events and isoform expressions. Recently, numerous computational tools for isoform detection using long-read sequencing data have been developed. Nevertheless, there remains a deficiency in comparative studies that systemically evaluate the performance of these tools, which are implemented with different algorithms, under various simulations that encompass potential influencing factors. In this study, we conducted a benchmark analysis of thirteen methods implemented in nine tools capable of identifying isoform structures from long-read RNA-seq data. We evaluated their performances using simulated data, which represented diverse sequencing platforms generated by an in-house simulator, RNA sequins (sequencing spike-ins) data, as well as experimental data. Our findings demonstrate IsoQuant as a highly effective tool for isoform detection with LRS, with Bambu and StringTie2 also exhibiting strong performance. These results offer valuable guidance for future research on alternative splicing analysis and the ongoing improvement of tools for isoform detection using LRS data.

Language: English

Cited by: 11

Long-read sequencing for 29 immune cell subsets reveals disease-linked isoforms
Jun Inamo, Akari Suzuki, Mahoko Takahashi Ueda, et al.

Nature Communications, Year: 2024, Issue: 15(1)

Published: May 28, 2024

Alternative splicing events are a major causal mechanism for complex traits, but they have been understudied due to the limitation of short-read sequencing. Here, we generate a full-length isoform annotation of human immune cells from an individual by long-read sequencing for 29 cell subsets. This contains a number of unannotated transcripts and isoforms, such as a read-through transcript of TOMM40-APOE in the Alzheimer’s disease locus. We profile the characteristics of isoforms and show that repetitive elements significantly explain the diversity of unannotated isoforms, providing insight into human genome evolution. In addition, some of the isoforms are expressed in a cell-type specific manner, whose alternative 3’-UTRs usage contributes to their specificity. Further, we identify disease-associated isoforms by isoform switch analysis and by integration of several quantitative trait loci analyses with genome-wide association study data. Our findings will promote the elucidation of the mechanism of complex diseases via alternative splicing.

Language: English

Cited by: 9

Emerging and re-emerging themes in co-transcriptional pre-mRNA splicing
Tucker J. Carrocci, Karla M. Neugebauer

Molecular Cell, Year: 2024, Issue: 84(19), pp. 3656-3666

Published: Oct. 1, 2024

Language: English

Cited by: 7

Bridging the splicing gap in human genetics with long-read RNA sequencing: finding the protein isoform drivers of disease
Peter J. Castaldi, Abdullah Abood, Charles R. Farber, et al.

Human Molecular Genetics, Year: 2022, Issue: 31(R1), pp. R123-R136

Published: Aug. 12, 2022

Aberrant splicing underlies many human diseases, including cancer, cardiovascular diseases and neurological disorders. Genome-wide mapping of splicing quantitative trait loci (sQTLs) has shown that genetic regulation of alternative splicing is widespread. However, identification of the corresponding isoform or protein products associated with disease-associated sQTLs is challenging with short-read RNA-seq, which cannot precisely characterize full-length transcript isoforms. Furthermore, contemporary sQTL interpretation often relies on reference transcript annotations, which are incomplete. Solutions to these issues may be found through integration of newly emerging long-read sequencing technologies. Long-read sequencing offers the capability to sequence full-length mRNA transcripts and, in some cases, to link sQTLs to transcript isoforms containing disease-relevant protein alterations. Here, we provide an overview of sQTL mapping approaches, the use of long-read sequencing to characterize sQTL effects on isoforms, and the linkage of RNA isoforms to protein-level functions, and comment on future directions in the field. Based on recent progress, long-read RNA sequencing promises to be part of the human disease genetics toolkit to discover and treat protein isoforms causing rare and complex diseases.

Language: English

Cited by: 18

Long-read sequencing transcriptome quantification with lr-kallisto
Rebekah K. Loving, Delaney K. Sullivan, Fairlie Reese, et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: July 19, 2024

RNA abundance quantification has become routine and affordable thanks to high-throughput “short-read” technologies that provide accurate molecule counts at the gene level. Similarly accurate and affordable quantification of definitive full-length transcript isoforms has remained a stubborn challenge, despite its obvious biological significance across a wide range of problems. “Long-read” sequencing platforms now produce data-types that can, in principle, drive routine definitive isoform quantification. However, some particulars of contemporary long-read datatypes, together with isoform complexity and genetic variation, present bioinformatic challenges. We show here, using ONT data, that fast and accurate quantification of long-read data is possible and that it is improved by exome capture. To perform quantifications we developed lr-kallisto, which adapts the kallisto bulk and single-cell RNA-seq quantification methods for long-read technologies.

Language: English

Cited by: 3

SQANTI-SIM: a simulator of controlled transcript novelty for lrRNA-seq benchmark
Jorge Mestre-Tomás, Tianyuan Liu, Francisco J. Pardo-Palacios, et al.

Genome Biology, Year: 2023, Issue: 24(1)

Published: Dec. 11, 2023

Long-read RNA sequencing has emerged as a powerful tool for transcript discovery, even in well-annotated organisms. However, assessing the accuracy of different methods in identifying annotated and novel transcripts remains a challenge. Here, we present SQANTI-SIM, a versatile tool that wraps around popular long-read simulators to allow precise management of transcript novelty based on the structural categories defined by SQANTI3. By selectively excluding specific transcripts from the reference dataset, SQANTI-SIM effectively emulates scenarios involving unannotated transcripts. Furthermore, the tool provides customizable features and supports the simulation of additional types of data, representing the first multi-omics simulation tool for the lrRNA-seq field.

Language: English

Cited by: 6

Comparison of Single-cell Long-read and Short-read Transcriptome Sequencing of Patient-derived Organoid Cells of ccRCC: Quality Evaluation of the MAS-ISO-seq Approach
Natalia Zajac, Qin Zhang, Anna Bratus-Neuschwander, et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: Mar. 15, 2024

Single-cell RNA sequencing is used in profiling gene expression differences between cells. Short-read sequencing platforms provide high throughput and high-quality information at the gene level, but the technique is hindered by limited read length, failing to provide an understanding of cell heterogeneity at the isoform level. This gap has recently been addressed by long-read sequencing platforms that provide the opportunity to preserve full-length transcript information during sequencing. To objectively evaluate the information obtained from both methods, we sequenced four samples of patient-derived organoid cells of clear cell renal cell carcinoma and one healthy kidney sample on Illumina Novaseq 6000 and PacBio Sequel IIe. For both methods and for each sample, the cDNA was derived from the same 10x Genomics 3’ single-cell library. Here we present the technical characteristics of both datasets and compare metrics and gene-level information. We show that the two methods largely overlap in their results, but we also identify sources of variability which present a set of advantages and disadvantages of both methods.

Language: English

Cited by: 1