Inverted duplicate DNA sequences increase translocation rates through sequencing nanopores resulting in reduced base calling accuracy
Pieter Spealman, Jaden Burrell, David Gresham

et al.

Nucleic Acids Research, Year: 2020, Volume: 48(9), Pages: 4940-4945

Published: April 3, 2020

Abstract: Inverted duplicated DNA sequences are a common feature of structural variants (SVs) and copy number variants (CNVs). Analysis of CNVs containing inverted duplicated DNA sequences using nanopore sequencing identified recurrent aberrant behavior characterized by low confidence, incorrect and missed base calls. Inverted duplicate DNA sequences in both yeast and human samples were observed to have systematic elevation in the electrical current detected at the nanopore, increased translocation rates and decreased sampling rates. The coincidence of inverted duplicated DNA sequences with dramatically reduced sequencing accuracy and an increased translocation rate suggests that secondary DNA structures may interfere with the dynamics of transit of the DNA through the nanopore.

Language: English

A comprehensive examination of Nanopore native RNA sequencing for characterization of complex transcriptomes
Charlotte Soneson, Yao Yao, Anna Bratus-Neuenschwander

et al.

Nature Communications, Year: 2019, Volume: 10(1)

Published: July 31, 2019

A platform for highly parallel direct sequencing of native RNA strands was recently described by Oxford Nanopore Technologies (ONT), but despite initial efforts it remains crucial to further investigate the technology for quantification of complex transcriptomes. Here we undertake native RNA sequencing of polyA+ RNA from two human cell lines, analysing ~5.2 million aligned native RNA reads. To enable informative comparisons, we also perform relevant ONT direct cDNA- and Illumina-sequencing. We find that while native RNA sequencing does enable some of the anticipated advantages, key unexpected aspects currently hamper its performance, most notably the quite frequent inability to obtain full-length transcripts from single reads, as well as difficulties to unambiguously infer their true transcript of origin. While characterising issues that need to be addressed when investigating more complex transcriptomes, our study highlights that with some defined improvements, native RNA sequencing could be an important addition to the mammalian transcriptomics toolbox.

Language: English

Cited by

223

Quantitative profiling of N6-methyladenosine at single-base resolution in stem-differentiating xylem of Populus trichocarpa using Nanopore direct RNA sequencing
Yubang Gao, Xuqing Liu, Bizhi Wu

et al.

Genome Biology, Year: 2021, Volume: 22(1)

Published: Jan. 7, 2021

Abstract: There are no comprehensive methods to identify N6-methyladenosine (m6A) at single-base resolution for every single transcript, which is necessary for the estimation of m6A abundance. We develop a new pipeline called Nanom6A for the identification and quantification of m6A modification at single-base resolution using Nanopore direct RNA sequencing based on an XGBoost model. We validate our method using methylated RNA immunoprecipitation sequencing (MeRIP-Seq) and m6A-sensitive RNA-endoribonuclease–facilitated sequencing (m6A-REF-seq), confirming high accuracy. Using this method, we provide a transcriptome-wide quantification of m6A modification in stem-differentiating xylem and reveal that different alternative polyadenylation (APA) usage shows a different ratio of m6A.

Language: English

Cited by

149

Alternative polyadenylation: methods, mechanism, function, and role in cancer
Yi Zhang, Lian Liu, Qiongzi Qiu

et al.

Journal of Experimental & Clinical Cancer Research, Year: 2021, Volume: 40(1)

Published: Feb. 1, 2021

Occurring in over 60% of human genes, alternative polyadenylation (APA) results in numerous transcripts with differing 3' ends, thus greatly expanding the diversity of mRNAs and of proteins derived from a single gene. As a key molecular mechanism, APA is involved in various gene regulation steps including mRNA maturation, mRNA stability, cellular RNA decay, and protein diversification. APA is frequently dysregulated in cancers, leading to changes in the expression of oncogenes and tumor suppressor genes. Recent studies have revealed various APA regulatory mechanisms that promote the development and progression of a number of human diseases, including cancer. Here, we provide an overview of four types of APA and their impacts on gene regulation. We focus particularly on the interaction of APA with microRNAs, RNA binding proteins and other related factors, the core pre-mRNA 3' end processing complex, and 3'UTR length change. We also describe next-generation sequencing methods and computational tools for use in poly(A) signal detection and APA repositories and databases. Finally, we summarize the current understanding of APA in cancer and provide our vision for future APA-related research.

Language: English

Cited by

107

De novo basecalling of RNA modifications at single molecule and nucleotide resolution
Sonia Cruciani, Anna Delgado-Tejedor, Leszek P. Pryszcz

et al.

Genome Biology, Year: 2025, Volume: 26(1)

Published: Feb. 25, 2025

Abstract: RNA modifications influence RNA function and fate, but detecting them in individual molecules remains challenging for most modifications. Here we present a novel methodology to generate training sets and build modification-aware basecalling models. Using this approach, we develop the m6ABasecaller, a basecalling model that predicts m6A modifications from raw nanopore signals. We validate its accuracy in vitro and in vivo, revealing stable m6A modification stoichiometry across isoforms, m6A co-occurrence within RNA molecules, and m6A-dependent effects on poly(A) tails. Finally, we demonstrate that our method generalizes to other RNA and DNA modifications, paving the path towards future efforts

Language: English

Cited by

3

Beyond sequencing: machine learning algorithms extract biology hidden in Nanopore signal data
Yuk Kei Wan, Christopher Hendra, Ploy N. Pratanwanich

et al.

Trends in Genetics, Year: 2021, Volume: 38(3), Pages: 246-257

Published: Oct. 25, 2021

Nanopore sequencing accuracy has increased to 98.3% as new-generation base callers replace early generation hidden Markov model basecalling algorithms with neural network algorithms.Machine learning methods can classify sequences in real-time, allowing targeted nanopore's ReadUntil feature.Machine and statistical testing tools detect DNA modifications by analyzing ion current signals from nanopore direct sequencing.Nanopore RNA profiles RNAs their modification retained, which influences the emitted nanopore.Machine analyze sequencing, enabling detection, secondary structure prediction, poly(A) tail length estimation. provides signal data corresponding nucleotide motifs sequenced. Through machine learning-based methods, these are translated into long-read that overcome read size limit of short-read sequencing. However, raw many more opportunities beyond just genomes transcriptomes: use approaches extract biological information allow detection modifications, estimation length, prediction structures. In this review, we discuss how developments methodologies contributed accurate lower error rates, enable new discoveries. We argue a dimensionality for genomics experiments highlight challenges future directions computational additional provided data. High-throughput played pivotal role broadening our understanding biology. Short-read technologies have advanced genetic diversity [1.1000 Genomes Project Consortium et al.A global reference human variation.Nature. 2015; 526: 68-74Google Scholar,2.Wu D. al.Large-scale whole-genome three diverse Asian populations Singapore.Cell. 2019; 179: 736-749.e15Google Scholar], insights transcriptomes cell healthy [3.GTEx The Genotype-Tissue Expression (GTEx) project.Nat. Genet. 2013; 45: 580-585Google Scholar,4.Regev A. al.The Human Cell Atlas.eLife. 2017; 6e27041Google helped deciphering disease biology [5.Weinstein J.N. 
Cancer Genome Atlas Pan-Cancer analysis 1113-1120Google Scholar, 6.PCAWG Transcriptome Core Group al.Genomic basis alterations cancer.Nature. 2020; 578: 129-136Google 7.ICGC/TCGA Analysis Whole Pan-cancer whole genomes.Nature. 82-93Google 8.Hoadley K.A. al.Cell-of-origin patterns dominate molecular classification 10,000 tumors 33 types cancer.Cell. 2018; 173: 291-304.e6Google Scholar]. On top epigenetic influence gene expression [9.Allis C.D. Jenuwein T. hallmarks control.Nat. Rev. 2016; 17: 487-500Google Scholar] epitranscriptomic (see Glossary) impact processing, stability, translation efficiency [10.Roundtree I.A. al.Dynamic regulation.Cell. 169: 1187-1200Google By coupling high-throughput wet lab techniques, such MeRIP (methylated immunoprecipitation)-seq [11.Meyer K.D. al.Comprehensive mRNA methylation reveals enrichment 3′ UTRs near stop codons.Cell. 2012; 149: 1635-1646Google miCLIP (m6A individual-nucleotide-resolution cross-linking [12.Linder B. al.Single-nucleotide-resolution mapping m6A m6Am throughout transcriptome.Nat. Methods. 12: 767-772Google bisulfite [13.Frommer M. genomic protocol yields positive display 5-methylcytosine residues individual strands.Proc. Natl. Acad. Sci. U. S. 1992; 89: 1827-1831Google profiling [14.Novoa E.M. al.Charting unknown epitranscriptome.Nat. Mol. Biol. 18: 339-340Google Although on easily scalable strategies, involves highly specialized protocols. Oxford Technologies (ONT) method (nanopore sequencing) allows genome epigenome, or transcriptome epitranscriptome single assay [15.Garalde D.R. al.Highly parallel an array nanopores.Nat. 15: 201-206Google Scholar,16.Rand A.C. al.Mapping sequencing.Nat. 14: 411-413Google generates long reads each molecule directly translocates through nanopore. As nucleic acids move nanopores different combinations, changes electrical measured (Figure 1A ). This not only enables determination sequence bases, but also structures developed purposes. 
Because complex nature signal, been key extracting layers information. will provide overview facilitate GitHub page described i. concepts basecalling, illustrate they applied introduce supervised unsupervised identifying Finally, outlook should further discovery methods. Basecalling is process translates bases 1B). correspond one five (RNA) six (DNA) (k-mer) another during translocation acid pore. noisy makes determining associating k-mers based solely difficult share similar ranges values, especially true presence homopolymers [17.Branton potential Biotechnol. 2008; 26: 1146-1153Google Early basecallers employ error-prone time-consuming segmentation process, divides series k-mer-corresponding segments [18.Teng H. al.Chiron: translating using deep learning.Gigascience. 7giy037Google These generate 85% ii. Since then, improvements major driver increase accuracy, achieving over correctly identified basesiii. first including ONT's cloud-based Metrichoriv open-source software Nanocall [19.David al.Nanocall: open source basecaller data.Bioinformatics. 33: 49-55Google offline alternative Metrichor, utilize (HMM) decoding Assuming moves pore at time, HMM-based treat chain observable events while states within HMM [20.Timp W. al.DNA base-calling Viterbi algorithm.Biophys. J. 102: L37-L39Google nucleotides state overlap last previous state, joint probabilities be calculated, path maximum total probability represents final predicted 2A To improve sequence, algorithm PoreSeq introduces artificial mutations replaces short regions original best same mutated having higher [21.Szalay Golovchenko J.A. De novo variant calling PoreSeq.Nat. 1087-1091Google predict short-range dependencies k-mer its next, may overlook long-range Furthermore, inaccurately describes expected values cause biases [22.Boža V. al.DeepNano: recurrent networks MinION reads.PLoS One. 
12e0178751Google constraints, Albacore (prior version 2.0.1)v nanonetvi, DeepNano BasecRAWller [23.Stoiber Brown BasecRAWller: streaming signal.bioRxiv. (Published online May 1, 2017)https://doi.org/10.1101/133058Google (RNN) framework basecalling. A unidirectional RNN takes input vector calculate associated distribution 2B). Albacore, nanonet, bidirectional RNN, incorporates 2C). Still, RNNs time consuming; therefore, BasecRAWller, aim achieve real-time uses two both segment basecall fashion, resulting overall faster run depend define boundaries sharp change 1). Segmentation prone due varying speed address this, segmentation-free developed, 2.0.1v Chiron eliminate step, combines convolutional (CNN) features predicting probability. Then, it implements connectionist temporal (CTC) decoder select highest position 2D) does many-to-one finalize complete Chiron's approach outperforms segmentation-dependent framework's reliance results points running time. up Causalcall processing inputting segmented measurements matrix network, models calculates occurrence point. It CTC output fixed-size overlaps [24.Zeng al.Causalcall: network.Front. 10: 1332Google combination CNN used research Bonitovii, achieved unprecedentedly high 98.3%, making comparable next-generation sequencingiii. unique feature eject real thereby free specific interest. determine whether target requires rapid few possible reads. depth regions, applications sequencing-based diagnosis novel microbial metagenomic samples [25.Payne al.Readfish gigabase-sized genomes.Nat. 2021; 39: 442-450Google Scholar,26.Kovaka al.Targeted UNCALLED.Nat. 9: 431-441Google Approaches utilizing includes Readfish UNCALLED [26.Kovaka SquiggleNet [27.Bao Y. al.Real-time, SquiggleNet.bioRxiv. January 20, 2021)https://doi.org/10.1101/2021.01.15.426907Google pipeline guppy, aligns reference, then decides pores Similar success seen converts (k-mers) searches matches consistent event-matched k-mers. 
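The connectionist temporal classification (CTC) decoding step mentioned above for segmentation-free basecallers can be illustrated with a minimal sketch. It assumes a network has already produced one probability vector per time step over a blank symbol plus the four DNA bases, and it uses simple greedy decoding (take the most probable label at each step, merge consecutive repeats, drop blanks). The names `ALPHABET`, `BLANK`, and `ctc_greedy_decode` and the toy probabilities are hypothetical and chosen for illustration only; this is not the decoder of Chiron or any other specific basecaller, which may use beam search instead.

```python
from itertools import groupby

# Hypothetical label set: index 0 is the CTC blank, followed by the four DNA bases.
BLANK = "-"
ALPHABET = [BLANK, "A", "C", "G", "T"]


def ctc_greedy_decode(probs, alphabet=ALPHABET, blank=BLANK):
    """Greedy CTC decoding of per-time-step label probabilities.

    probs: list of rows, one per time step; each row holds one
           probability per entry of `alphabet`.
    Returns the decoded base sequence as a string.
    """
    # 1. Pick the most probable label at every time step.
    best = [alphabet[max(range(len(row)), key=row.__getitem__)] for row in probs]
    # 2. Many-to-one mapping: merge runs of identical consecutive labels.
    collapsed = [label for label, _ in groupby(best)]
    # 3. Remove blanks; a blank between two identical bases keeps them distinct.
    return "".join(label for label in collapsed if label != blank)


# Toy input: the per-step argmax labels are A, A, -, A, G, T, T.
example = [
    [0.10, 0.70, 0.10, 0.05, 0.05],
    [0.10, 0.60, 0.10, 0.10, 0.10],
    [0.80, 0.05, 0.05, 0.05, 0.05],
    [0.10, 0.70, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.10, 0.60, 0.10],
    [0.10, 0.10, 0.10, 0.10, 0.60],
    [0.10, 0.10, 0.10, 0.10, 0.60],
]

print(ctc_greedy_decode(example))  # AAGT
```

The blank symbol is what lets the decoder emit a homopolymer such as "AA": the two A runs separated by a blank stay distinct, while the repeated T at the end collapses to a single base.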
After clustering coordinates, filters out false positives reports best-supported location Using neural-network framework, was learned training allowed real-time. application effectively cancer genes leading interest without requirement experiments. Along basecalled sequences, downstream analyses require sequence-aligned inputs 1C). performs signal-to-reference alignment. Two performing tombo's resquiggleviii (previously nanoraw [28.Stoiber al.De identification enabled genome-guided processing.bioRxiv. December 15, 2016)https://doi.org/10.1101/094672Google Scholar]) nanopolish's eventalign [29.Loman N.J. bacterial assembled de data.Nat. 733-735Google Tombo's resquiggle identifies event large shifts level occupies Tombo assigns dynamic warping algorithm. Nanopolish's adaptive banded alignment most likely read. aligning extraction analyses. Analyzing genome-aligned and/or chromatin accessibility 1D). common include N4-cytosine (4mC), (5mC), 5-hydroxymethylcytosine (5hmC), N6-methyladenine (6mA) 29.Loman 30.Liu Q. al.NanoMod: tool data.BMC Genomics. 20: 78Google 31.Simpson J.T. al.Detecting cytosine 407-410Google 32.Ni P. al.DeepSignal: detecting deep-learning.Bioinformatics. 35: 4586-4595Google 33.Lee I. al.Simultaneous lines 1191-1199Google 34.McIntyre A.B.R. al.Single-molecule materials.Nat. Commun. 579Google 35.Jin Z. Liu diseases.Genes Diseases. 5: 1-8Google 36.Flusberg B.A. al.Direct single-molecule, 2010; 7: 461-465Google 37.Shah K. al.Adenine Drosophila tissue-specific developmental regulatory genes.G3. 
1893-1900Google Scholar]x, known regulate transcription alter processes, some them clinical relevance [35.Jin GpC [33.Lee do rely data, discover infer manner (Table 1).Table 1Overview (a comprehensive list available onlinei)ApplicationToolDataType/modification analysisRefsInfer accessibilitynanoNOMe (nanopolish extension)DNACpG, methylation[33.Lee Scholar]Detect modificationnanopolish call-methylationDNA5mC[31.Simpson Scholar]SignalAlignDNA5mC,5hmC,6mA[16.Rand Scholar]mCallerDNA6mA[34.McIntyre Scholar]DeepSignalsDNA6mA[32.Ni Scholar]NanoModDNADe detection[30.Liu modificationtombo detect_modificationsDNA/RNAAlternate detection[28.Stoiber Scholar]MINESRNAm6A[42.Lorenz D.A. mA endogenous transcript isoforms base-specific resolution.RNA. 19-28Google Scholar]EpiNanoRNAm6A[43.Liu al.Accurate native sequences.Nat. 4079Google Scholar]Nanom6ARNAm6A[44.Gao al.Quantitative N-methyladenosine single-base resolution stem-differentiating xylem Populus trichocarpa sequencing.Genome 22: 22Google Scholar]m6anetRNAm6A[45.Hendra C. al.Detection Multiple Instance Learning framework.bioRxiv. September 22, 2021. https://doi.org/10.1101/2021.09.20.461055)Google Scholar]nano-IDRNA5-EU[46.Maier K.C. al.Native nano-ID synthesis stability isoforms.Genome Res. 30: 1332-1344Google Scholar]nanoRMSRNAΨ, Nm, comparative detection[47.Begik O. pseudouridylation dynamics 13, 2021)https://doi.org/10.1038/s41587-021-00915-6Google Scholar]YanocompRNAComparative detection[48.Parker M.T. al.Yanocomp: robust reads.bioRxiv. June 16, 2021)https://doi.org/10.1101/2021.06.15.448494Google Scholar]DiffErrRNAComparative detectionixELIGOSRNAComparative detection[49.Jenjaroenpun al.Decoding epitranscriptional landscape sequences.Nucleic Acids 49e7Google Scholar]nanoDocRNAComparative detection[50.Ueda nanoDoc: Deep One-Class Classification.bioRxiv. 2020)https://doi.org/10.1101/2020.09.13.295089Google Scholar]nanocomporeRNAComparative detection[51.Leger al.RNA sequencing.bioRxiv. 
November 2019)https://doi.org/10.1101/843136Google Scholar]DRUMMERRNAComparative detection[52.Price A.M. adenovirus necessary efficient splicing.Nat. 11: 6016Google Scholar]xPoreRNADifferential rate analysis[53.Pratanwanich P.N. al.Identification differential xPore.Nat. July 2021)https://doi.org/10.1038/s41587-021-00949-wGoogle Scholar]Predict 2° structurenanoSHAPERNARNA (Nm, 2′-O-acetyl)[54.Stephenson 01, 2020)https://doi.org/10.1101/2020.05.31.126763Google Scholar]PORE-cupineRNARNA (NAI-N3)[55.Aw J.G.A. al.Determination isoform-specific reads.Nat. 336-346Google Scholar]Estimate lengthnanopolish polyaRNAPolyA tails[56.Workman R.E. al.Nanopore 16: 1297-1305Google Scholar]tailfindrRNAPolyA Scholar,57.Krause al.tailfindr: alignment-free measurement sequencing.RNA. 25: 1229-1241Google Open table tab trained sets experimentally validated sites (labeled data). Labels modified cytosines, inferring obtained [16.Rand Scholar,31.Simpson labels artificially methylated methyltransferases orthogonal PacBio naturally existing [34.McIntyre Scholar,36.Flusberg Supervised nanopolish, signalAlign, mCaller, DeepSignals, detect_modifications module's mode D

Language: English

Cited by

82

FlsnRNA-seq: protoplasting-free full-length single-nucleus RNA profiling in plants
Yanping Long, Zhijian Liu, Jinbu Jia

et al.

Genome Biology, Year: 2021, Volume: 22(1)

Published: Feb. 19, 2021

Abstract: The broad application of single-cell RNA profiling in plants has been hindered by the prerequisite of protoplasting, which requires digesting the cell walls from different types of plant tissues. Here, we present a protoplasting-free approach, flsnRNA-seq, for large-scale full-length RNA profiling at a single-nucleus level in plants using isolated nuclei. Combined with 10x Genomics and Nanopore long-read sequencing, we validate the robustness of this approach in Arabidopsis root cells and the developing endosperm. Sequencing results demonstrate that it allows for uncovering alternative splicing and polyadenylation-related RNA isoform information at the single-cell level, which facilitates characterizing cell identities.

Language: English

Cited by

80

Molecular barcoding of native RNAs using nanopore sequencing and deep learning
Martin A. Smith, Tansel Ersavas, James M. Ferguson

et al.

Genome Research, Year: 2020, Volume: 30(9), Pages: 1345-1353

Published: Sep. 1, 2020

Nanopore sequencing enables direct measurement of RNA molecules without conversion to cDNA, thus opening the gates to a new era for RNA biology. However, the lack of molecular barcoding of direct RNA nanopore sequencing data sets severely affects the applicability of this technology to biological samples, where RNA availability is often limited. Here, we provide the first experimental protocol and associated algorithm to barcode and demultiplex direct RNA nanopore sequencing data sets. Specifically, we present a novel and robust approach to accurately classify raw nanopore signal data by transforming current intensities into images or arrays of pixels, followed by classification using a deep learning algorithm. We demonstrate the power of this strategy by developing the first experimental protocol for barcoding and demultiplexing direct RNA sequencing libraries. Our method, DeePlexiCon, can classify 93% of reads with 95.1% accuracy or 60% of reads with 99.9% accuracy. The availability of an efficient and simple multiplexing strategy for native RNA sequencing will improve the cost-effectiveness of this technology, as well as facilitate the analysis of lower-input biological samples. Overall, our work exemplifies the power, simplicity, and robustness of signal-to-image conversion for nanopore data analysis using deep learning.

Language: English

Cited by

77

Measuring the tail: Methods for poly(A) tail profiling
Aleksandra Brouze, Paweł S. Krawczyk, Andrzej Dziembowski

et al.

Wiley Interdisciplinary Reviews - RNA, Year: 2022, Volume: 14(1)

Published: May 26, 2022

Abstract: The 3′-end poly(A) tail is an important and potent feature of most mRNA molecules that affects mRNA fate and translation efficiency. Polyadenylation is a posttranscriptional process that occurs in the nucleus by canonical poly(A) polymerases (PAPs). In some specific instances, the poly(A) tail can also be extended in the cytoplasm by noncanonical poly(A) polymerases (ncPAPs). This epitranscriptomic regulation of mRNA recently became one of the most interesting aspects in the field. Advances in RNA sequencing technologies and software development have allowed the precise measurement of poly(A) tails, identification of new ncPAPs, expansion of the function of known enzymes, discovery and a better understanding of the physiological role of tail heterogeneity, and recognition of a correlation between tail length and RNA translatability. Here, we summarize the development of polyadenylation research methods, including classic low-throughput approaches, Illumina-based genome-wide analysis, and advanced state-of-the-art techniques that utilize long-read third-generation sequencing with Pacific Biosciences and Oxford Nanopore Technologies platforms. A boost in technical opportunities over recent decades has allowed a better understanding of the regulation of gene expression at the mRNA level. This article is categorized under: RNA Methods > RNA Analyses in Vitro and In Silico

Language: English

Cited by

41

Nano3P-seq: transcriptome-wide analysis of gene expression and tail dynamics using end-capture nanopore cDNA sequencing
Oguzhan Begik, Gregor Diensthuber, Huanle Liu

et al.

Nature Methods, Year: 2022, Volume: 20(1), Pages: 75-85

Published: Dec. 19, 2022

Abstract: RNA polyadenylation plays a central role in RNA maturation, fate, and stability. In response to developmental cues, polyA tail lengths can vary, affecting the translation efficiency and stability of mRNAs. Here we develop Nanopore 3′ end-capture sequencing (Nano3P-seq), a method that relies on nanopore cDNA sequencing to simultaneously quantify RNA abundance, tail composition, and tail length dynamics at per-read resolution. By employing a template-switching-based sequencing protocol, Nano3P-seq can sequence an RNA molecule from its 3′ end, regardless of its polyadenylation status, without the need for PCR amplification or ligation of RNA adapters. We demonstrate that Nano3P-seq provides quantitative estimates of RNA abundance and tail lengths, and captures a wide diversity of RNA biotypes. We find that, in addition to mRNA and long non-coding RNA, polyA tails can be identified in 16S mitochondrial ribosomal RNA in both mouse and zebrafish models. Moreover, we show that mRNA tail lengths are dynamically regulated during vertebrate embryogenesis at an isoform-specific level, correlating with mRNA decay. Finally, we demonstrate the ability of Nano3P-seq to capture non-A bases within polyA tails of various lengths, and reveal their distribution during vertebrate embryogenesis. Overall, Nano3P-seq is a simple and robust method for accurately estimating transcript levels, tail lengths, and tail composition heterogeneity in individual reads, with minimal library preparation biases, both in the coding and non-coding transcriptome.

Language: English

Cited by

40

Diagnostics and analysis of SARS-CoV-2: current status, recent advances, challenges and perspectives
Tao Dong, Mingyang Wang, Junchong Liu

et al.

Chemical Science, Year: 2023, Volume: 14(23), Pages: 6149-6206

Published: Jan. 1, 2023

The disastrous spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has induced severe public healthcare issues and weakened the global economy significantly. Although SARS-CoV-2 infection is not as fatal as in the initial outbreak, many infected victims suffer from long COVID. Therefore, rapid and large-scale testing is critical in managing patients and alleviating its transmission. Herein, we review the recent advances in techniques to detect SARS-CoV-2. The sensing principles are detailed together with their application domains and analytical performances. In addition, the advantages and limits of each method are discussed and analyzed. Besides molecular diagnostics and antigen and antibody tests, we also review neutralizing antibodies and emerging SARS-CoV-2 variants. Further, the characteristics of the mutational locations in the different variants, together with their epidemiological features, are summarized. Finally, the challenges and possible strategies are prospected to develop new assays to meet different diagnostic needs. Thus, this comprehensive and systematic review of SARS-CoV-2 detection technologies may provide insightful guidance and direction for developing tools for the diagnosis and analysis of SARS-CoV-2 to support effective long-term pandemic management and control.

Language: English

Cited by

39