Computers in Biology and Medicine, Journal Year: 2025, Volume and Issue: 186, P. 109625 - 109625
Published: Jan. 4, 2025
Language: English
Genomics Proteomics & Bioinformatics, Journal Year: 2022, Volume and Issue: 21(4), P. 678 - 694
Published: Sept. 9, 2022
As the most pervasive epigenetic marker present on mRNAs and long non-coding RNAs (lncRNAs), N
Language: English
Citations
29
Briefings in Bioinformatics, Journal Year: 2023, Volume and Issue: 24(3)
Published: May 1, 2023
A-to-I editing is the most prevalent RNA editing event, which refers to the change of adenosine (A) bases to inosine (I) bases in double-stranded RNAs. Several studies have revealed that A-to-I editing can regulate cellular processes and is associated with various human diseases. Therefore, accurate identification of A-to-I editing sites is crucial for understanding RNA-level (i.e. transcriptional) modifications and their potential roles in molecular functions. To date, computational approaches for A-to-I editing site identification have been developed; however, their performance is still unsatisfactory and needs further improvement. In this study, we developed a novel stacked-ensemble learning model, ATTIC (A-To-I ediTing predICtor), to accurately identify A-to-I editing sites across three species, including Homo sapiens, Mus musculus and Drosophila melanogaster. We first comprehensively evaluated 37 sequence-derived features combined with 14 popular machine learning algorithms. Then, we selected the optimal base models to build a series of stacked ensemble models. The final ATTIC framework was based on the optimal models, improved by a feature selection strategy for each specific species. Extensive cross-validation and independent tests illustrate that ATTIC outperforms state-of-the-art tools for predicting A-to-I editing sites. We also developed a web server for ATTIC, which is publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/ATTIC/. We anticipate that ATTIC can be utilized as a useful tool to accelerate the identification of A-to-I editing events and help characterize their roles in post-transcriptional regulation.
Language: English
Citations
17
Epigenetics, Journal Year: 2023, Volume and Issue: 18(1)
Published: June 18, 2023
Epitranscriptomic modifications have recently emerged into the spotlight of researchers due to their vast regulatory effects on gene expression and thereby on cellular physiology and pathophysiology. N6,2'-O-dimethyladenosine (m6Am) is one of the most prevalent chemical marks on RNA and is dynamically regulated by writers (PCIF1, METTL4) and erasers (FTO). The presence or absence of m6Am in RNA affects mRNA stability, regulates transcription, and modulates pre-mRNA splicing. Nevertheless, its functions in the heart are poorly known. This review summarizes the current knowledge and gaps about m6Am modification and its regulators in cardiac biology. It also points out technical challenges and lists the currently available techniques to measure m6Am. A better understanding of epitranscriptomic modifications is needed to improve our knowledge of the molecular regulations in the heart, which may lead to novel cardioprotective strategies.
Language: English
Citations
16
BMC Bioinformatics, Journal Year: 2024, Volume and Issue: 25(1)
Published: Jan. 17, 2024
Abstract Background: Epi-transcriptome regulation through post-transcriptional RNA modifications is essential for all RNA types. Precise recognition of RNA modifications is critical for understanding their functions and regulatory mechanisms. However, wet experimental methods are often costly and time-consuming, limiting their wide range of applications. Therefore, recent research has focused on developing computational methods, particularly deep learning (DL). Bidirectional long short-term memory (BiLSTM), convolutional neural network (CNN), and the transformer have demonstrated achievements in modification site prediction. However, BiLSTM cannot achieve parallel computation, leading to a long training time; CNN cannot learn long-distance dependencies in the sequence; and the Transformer lacks information interaction with sequences at different scales. This insight underscores the necessity of continued development in natural language processing (NLP) and DL to devise an enhanced prediction framework that can effectively address the challenges presented. Results: This study presents a multi-scale self- and cross-attention network (MSCAN) to identify RNA methylation sites using an NLP and DL approach. Experiment results on twelve RNA modification sites (m6A, m1A, m5C, m5U, m6Am, m7G, Ψ, I, Am, Cm, Gm, and Um) reveal that the area under the receiver operating characteristic curve of MSCAN reaches, respectively, 98.34%, 85.41%, 97.29%, 96.74%, 99.04%, 79.94%, 76.22%, 65.69%, 92.92%, 92.03%, 95.77%, and 89.66%, which is better than the state-of-the-art prediction model. This indicates that the model has strong generalization capabilities. Furthermore, MSCAN reveals an association among the different types of RNA modifications. A user-friendly web server for predicting twelve widely occurring human RNA modification sites is available at http://47.242.23.141/MSCAN/index.php . Conclusions: A predictor framework has been developed through binary classification to predict RNA methylation sites.
Language: English
Citations
7
Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 50(18), P. 10290 - 10310
Published: Sept. 26, 2022
Abstract: As the most pervasive epigenetic mark present on mRNA and lncRNA, N6-methyladenosine (m6A) RNA methylation regulates all stages of RNA life in various biological processes and disease mechanisms. Computational methods for deciphering RNA modification have achieved great success in recent years; nevertheless, their potential remains underexploited. One reason for this is that existing models usually consider only the sequence of transcripts, ignoring the various regions (or geography) of transcripts, such as the 3′UTR and introns, where the epigenetic mark forms and functions. Here, we developed three simple yet powerful encoding schemes for transcripts to capture the submolecular geographic information of RNA, which is largely independent from sequences. We show that m6A prediction models based on geographic information alone can achieve performances comparable to classic sequence-based methods. Importantly, geographic information substantially enhances the accuracy of sequence-based models, enables isoform- and tissue-specific prediction of m6A sites, and improves m6A signal detection from direct RNA sequencing data. The geographic encoding schemes exhibited strong interpretability, are applicable not only to m6A but also to N1-methyladenosine (m1A), and can serve as a general and effective complement to the widely used sequence encoding schemes in deep learning applications concerning RNA transcripts.
Language: English
Citations
23
Briefings in Bioinformatics, Journal Year: 2023, Volume and Issue: 24(3)
Published: May 1, 2023
Abstract: The expanding field of epitranscriptomics might rival the epigenome in the diversity of biological processes impacted. In recent years, the development of new high-throughput experimental and computational techniques has been a key driving force in discovering the properties of RNA modifications. Machine learning applications, such as for classification, clustering or de novo identification, have been critical in these advances. Nonetheless, various challenges remain before the full potential of machine learning for epitranscriptomics can be leveraged. In this review, we provide a comprehensive survey of machine learning methods to detect RNA modifications using diverse input data sources. We describe strategies to train and test machine learning methods and to encode and interpret features that are relevant for epitranscriptomics. Finally, we identify some of the current challenges and open questions about RNA modification analysis, including the ambiguity in predicting RNA modifications in transcript isoforms or in single nucleotides, and the lack of complete ground truth sets to test RNA modifications. We believe this review will inspire and benefit the rapidly developing field of epitranscriptomics in addressing the current limitations through the effective use of machine learning.
Language: English
Citations
15
IEEE/ACM Transactions on Computational Biology and Bioinformatics, Journal Year: 2023, Volume and Issue: 20(3), P. 2177 - 2189
Published: Jan. 17, 2023
Recent work on language models has resulted in state-of-the-art performance on various language tasks. Among these, Bidirectional Encoder Representations from Transformers (BERT) has focused on contextualizing word embeddings to extract the context and semantics of the words. On the other hand, post-transcriptional 2'-O-methylation (Nm) RNA modification is important in various cellular tasks and is related to a number of diseases. The existing high-throughput experimental techniques take a longer time to detect these modifications and are costly for exploring these functional processes. Here, to deeply understand the associated biological processes faster, we come up with an efficient method, Bert2Ome, to infer 2'-O-methylation RNA modification sites from RNA sequences. Bert2Ome combines a BERT-based model with convolutional neural networks (CNN) to infer the relationship between the modification sites and RNA sequence content. Unlike the methods proposed so far, Bert2Ome treats each given RNA sequence as a text and focuses on improving the modification prediction performance by integrating the pretrained deep learning-based language model BERT. Additionally, our transformer-based approach could infer modification sites across multiple species. According to 5-fold cross-validation, human and mouse accuracies were 99.15% and 94.35%, respectively. Similarly, ROC AUC scores were 0.99 and 0.94 for the same species. Detailed results show that Bert2Ome reduces the time consumed in biological experiments and outperforms the existing approaches across different datasets and species over multiple metrics. Additionally, deep learning approaches such as 2D CNNs are more promising in learning BERT attributes than conventional machine learning methods.
Language: English
Citations
14
Computers in Biology and Medicine, Journal Year: 2023, Volume and Issue: 164, P. 107238 - 107238
Published: July 8, 2023
Language: English
Citations
13
Methods, Journal Year: 2022, Volume and Issue: 203, P. 399 - 421
Published: March 3, 2022
Language: English
Citations
20
International Journal of Molecular Sciences, Journal Year: 2022, Volume and Issue: 23(19), P. 11026 - 11026
Published: Sept. 20, 2022
N6,2'-O-dimethyladenosine (m6Am) is a post-transcriptional modification that may be associated with regulatory roles in the control of cellular functions. Therefore, it is crucial to accurately identify transcriptome-wide m6Am sites to understand the underlying m6Am-dependent mRNA regulation mechanisms and biological functions. Here, we used three sequence-based feature-encoding schemes, including one-hot, nucleotide chemical property (NCP), and nucleotide density (ND), to represent RNA sequence samples. Additionally, we proposed an ensemble deep learning framework, named DLm6Am, to identify m6Am sites. DLm6Am consists of three similar base classifiers, each of which contains a multi-head attention module, an embedding module with two parallel deep learning sub-modules, a convolutional neural network (CNN) and a bi-directional long short-term memory (BiLSTM), and a prediction module. To demonstrate the superior performance of our model's architecture, we compared multiple model frameworks with our method by analyzing the training data and independent testing data. Additionally, we compared our model with the existing state-of-the-art computational methods, m6AmPred and MultiRM. The accuracy (ACC) of the DLm6Am model was improved by 6.45% and 8.42% compared to that of m6AmPred and MultiRM on independent testing data, respectively, while the area under the receiver operating characteristic curve (AUROC) was increased by 4.28% and 5.75%, respectively. All the results indicate that DLm6Am achieved the best prediction performance in terms of ACC, Matthews correlation coefficient (MCC), AUROC, and the area under precision and recall curves (AUPR). To further assess the generalization ability of our proposed model, we implemented chromosome-level leave-out cross-validation, and found that the obtained AUROC values were greater than 0.83, indicating that our proposed method is robust and can predict m6Am sites.
Language: English
Citations
20
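The three sequence encodings named in the DLm6Am abstract (one-hot, NCP, ND) can be illustrated with a minimal sketch. It follows the definitions commonly used in the RNA-modification literature and is not taken from the DLm6Am code; the function names `nucleotide_density` and `encode`, the channel order, and the T-to-U normalization are assumptions made for illustration.

```python
# Illustrative sketch only: commonly used definitions of the three encodings,
# not the DLm6Am implementation. All names here are hypothetical.

# One-hot: one channel per nucleotide, in the (assumed) order A, C, G, U.
ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0), "G": (0, 0, 1, 0), "U": (0, 0, 0, 1)}

# NCP: (ring structure: purine = 1, functional group: amino = 1,
#       hydrogen bonding: weak = 1).
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "U": (0, 0, 1)}


def nucleotide_density(seq):
    """ND at position i (1-based): how often the nucleotide at i has
    occurred in seq[:i], divided by i."""
    counts = {}
    densities = []
    for i, nt in enumerate(seq, start=1):
        counts[nt] = counts.get(nt, 0) + 1
        densities.append(counts[nt] / i)
    return densities


def encode(seq):
    """Return one 8-dimensional row per nucleotide:
    4 one-hot values, 3 NCP values, 1 ND value."""
    seq = seq.upper().replace("T", "U")  # assumption: accept DNA-style input
    nd = nucleotide_density(seq)
    return [list(ONE_HOT[nt]) + list(NCP[nt]) + [d] for nt, d in zip(seq, nd)]


if __name__ == "__main__":
    for row in encode("AACA"):
        print(row)
```

Concatenating the three encodings into one row per position is only one possible design; a model may equally feed each encoding to a separate input branch.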