Computers in Biology and Medicine, Year: 2023, Number 164, p. 107288
Published: Aug. 1, 2023
Language: English
Wiley Interdisciplinary Reviews Computational Molecular Science, Year: 2022, Number 12(4)
Published: Feb. 8, 2022
Drug development is time-consuming and expensive. Repurposing existing drugs for new therapies is an attractive solution that accelerates drug development at reduced experimental costs, specifically for Coronavirus Disease 2019 (COVID-19), an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, comprehensively obtaining and productively integrating available knowledge and big biomedical data to effectively advance deep learning models is still challenging for drug repurposing in other complex diseases. In this review, we introduce guidelines on how to utilize deep learning methodologies and tools for drug repurposing. We first summarized the commonly used bioinformatics and pharmacogenomics databases for drug repurposing. Next, we discuss recently developed sequence-based and graph-based representation approaches as well as state-of-the-art deep learning-based methods. Finally, we present applications of drug repurposing to fight the COVID-19 pandemic and outline its future challenges. This article is categorized under: Data Science > Artificial Intelligence/Machine Learning
Language: English
Cited: 97
Research, Year: 2022, Number 2022
Published: Jan. 1, 2022
With the rapid development of biotechnology, the number of biological sequences has grown exponentially. The continuous expansion of biological sequence data promotes the application of machine learning in biological sequences to construct predictive models for mining biological sequence information. There are many branches of biological sequence classification research. In this review, we mainly focus on the function and modification classification of biological sequences based on machine learning. Sequence-based prediction and analysis are the basic tasks to understand the biological functions of DNA, RNA, proteins, and peptides. However, there are hundreds of classification models developed for biological sequences, and the quite varied specific methods seem dizzying at first glance. Here, we aim to establish a long-term support website (http://lab.malab.cn/~acy/BioseqData/home.html), which provides readers with detailed information on the classification methods and download links to relevant datasets. We briefly introduce the steps to build an effective model framework for biological sequence data. In addition, a brief introduction to single-cell sequencing data and its applications in biology is also included. Finally, we discuss the current challenges and future perspectives of biological sequence classification research.
Language: English
Cited: 70
Biology, Year: 2023, Number 12(7), p. 1033
Published: July 22, 2023
The emergence and rapid development of deep learning, specifically transformer-based architectures and attention mechanisms, have had transformative implications across several domains, including bioinformatics and genome data analysis. The analogous nature of genome sequences to language texts has enabled the application of techniques that have exhibited success in fields ranging from natural language processing to genomic data. This review provides a comprehensive analysis of the most recent advancements in the application of transformer architectures and attention mechanisms to genome and transcriptome data. The focus of this review is on the critical evaluation of these techniques, discussing their advantages and limitations in the context of genome data analysis. With the swift pace of development in deep learning methodologies, it becomes vital to continually assess and reflect on the current standing and future direction of the research. Therefore, this review aims to serve as a timely resource for both seasoned researchers and newcomers, offering a panoramic view of the recent advancements and elucidating the state-of-the-art applications in the field. Furthermore, this review paper serves to highlight potential areas of future investigation by critically evaluating studies from 2019 to 2023, thereby acting as a stepping-stone for further research endeavors.
Language: English
Cited: 63
Nature Communications, Year: 2024, Number 15(1)
Published: May 14, 2024
Nanopore direct RNA sequencing (DRS) has emerged as a powerful tool for RNA modification identification. However, concurrently detecting multiple types of modifications in a single DRS sample remains a challenge. Here, we develop TandemMod, a transferable deep learning framework capable of detecting multiple types of RNA modifications in single DRS data. To train high-performance TandemMod models, we generate in vitro epitranscriptome datasets from cDNA libraries, containing thousands of transcripts labeled with various types of RNA modifications. We validate the performance of TandemMod on both in vitro transcripts and in vivo human cell lines, confirming its high accuracy for profiling m6A and m5C modification sites. Furthermore, we perform transfer learning for identifying other modifications such as m7G, Ψ, and inosine, significantly reducing training data size and running time without compromising performance. Finally, we apply TandemMod to identify 3 types of RNA modifications in rice grown in different environments, demonstrating its applicability across species and conditions. In summary, we provide a resource with ground-truth labels that can serve as benchmark datasets for nanopore-based modification identification methods, and TandemMod for identifying diverse RNA modifications using a single DRS sample.
Language: English
Cited: 17
Briefings in Bioinformatics, Year: 2021, Number 23(1)
Published: Oct. 8, 2021
Conventional supervised binary classification algorithms have been widely applied to address significant research questions using biological and biomedical data. This classification scheme requires two fully labeled classes of data (e.g. positive and negative samples) to train a classification model. However, in many bioinformatics applications, labeling data is laborious, and the negative samples might be potentially mislabeled due to the limited sensitivity of the experimental equipment. The positive unlabeled (PU) learning scheme was therefore proposed to enable the classifier to learn directly from positive samples and a large number of unlabeled samples (i.e. a mixture of positive or negative samples). To date, several PU learning algorithms have been developed to address various biological questions, such as sequence identification, functional site characterization and interaction prediction. In this paper, we revisit a collection of 29 state-of-the-art PU learning bioinformatic applications to address various biological questions. Various important aspects are extensively discussed, including PU learning methodology, biological application, classifier design and model evaluation strategy. We also comment on the existing issues of PU learning and offer our perspectives for the future development of PU learning applications. We anticipate that our work serves as an instrumental guideline for a better understanding of the PU learning framework in bioinformatics and for further developing next-generation PU learning frameworks for critical biological applications.
Language: English
Cited: 56
Molecular Therapy, Year: 2023, Number 31(8), pp. 2543-2551
Published: June 3, 2023
5-methylcytosine (m5C) is indeed a critical post-transcriptional alteration that is widely present in various kinds of RNAs and is crucial to the fundamental biological processes. By correctly identifying the m5C-methylation sites on RNA, clinicians can more clearly comprehend the precise function of these m5C-sites in different biological processes. Due to their effectiveness and affordability, computational methods have received greater attention over the last few years for the identification of methylation sites in various species. To precisely identify RNA m5C locations in five different species including Homo sapiens, Arabidopsis thaliana, Mus musculus, Drosophila melanogaster, and Danio rerio, we proposed a more effective and accurate model named m5C-pred. To create m5C-pred, distinct feature encoding techniques were combined to extract features from the RNA sequence, and then we used SHapley Additive exPlanations to choose the best features among them, followed by XGBoost as a classifier. We applied the novel optimization method called Optuna to quickly and efficiently determine the best hyperparameters. Finally, the proposed model was evaluated using independent test datasets, and we compared the results with the previous methods. Our approach, m5C-pred, is anticipated to be useful for accurately identifying m5C sites, outperforming the currently available state-of-the-art techniques.
Language: English
Cited: 28
The Innovation, Year: 2023, Number 4(4), p. 100452
Published: May 29, 2023
• RNA modification is a novel hotspot of epigenetic research, affecting a wide range of physiological and pathological processes.
• RNA modification plays an important role in tumor immunity.
• RNA modification may be a potential clinical therapeutic target to prevent tumor immune escape.
An immunosuppressive state is a typical feature of the tumor microenvironment. Despite the dramatic success of immune checkpoint inhibitor (ICI) therapy in preventing tumor cell escape from immune surveillance, primary and acquired resistance have limited its clinical use. Notably, recent clinical trials have shown that epigenetic drugs can significantly improve the outcome of ICI therapy in various cancers, indicating the importance of epigenetic modifications in the immune regulation of tumors. Recently, RNA modifications (N6-methyladenosine [m6A], N1-methyladenosine [m1A], 5-methylcytosine [m5C], etc.), novel hotspot areas of epigenetic research, have been shown to play crucial roles in protumor and antitumor immunity. In this review, we provide a comprehensive understanding of how m6A, m1A, and m5C function in tumor immunity by directly regulating different immune cells as well as indirectly regulating tumor cells through different mechanisms, including modulating the expression of immune checkpoints, inducing metabolic reprogramming, and affecting the secretion of immune-related factors. Finally, we discuss the current status of strategies targeting RNA modification to prevent tumor immune escape, highlighting their potential.
Language: English
Cited: 24
bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Number unknown
Published: July 12, 2023
RNA molecules play a crucial role as intermediaries in diverse biological processes. Attaining a profound understanding of their function can substantially enhance our comprehension of life's activities and facilitate drug development for numerous diseases. The advent of high-throughput sequencing technologies makes vast amounts of RNA sequence data accessible, which contains invaluable information and knowledge. However, deriving insights for further application from such an immense volume of data poses a significant challenge. Fortunately, recent advancements in pre-trained models have surfaced as a revolutionary solution for addressing such challenges owing to their exceptional ability to automatically mine and extract hidden knowledge from massive datasets. Inspired by the past successes, we developed a novel context-aware deep learning model named Uni-RNA that performs pre-training on the largest dataset of RNA sequences at an unprecedented scale to date. During this process, our model autonomously unraveled the obscured evolutionary and structural information embedded within the RNA sequences. As a result, through fine-tuning, our model achieved state-of-the-art (SOTA) performances in a spectrum of downstream tasks, including both structural and functional predictions. Overall, Uni-RNA established a new research paradigm empowered by the large pre-trained model in the field of RNA, enabling the community to unlock the power of AI at a whole new level to significantly expedite the pace of research and foster groundbreaking discoveries.
Language: English
Cited: 24
Information Sciences, Year: 2024, Number 660, p. 120105
Published: Jan. 9, 2024
Language: English
Cited: 12
Experimental & Molecular Medicine, Year: 2024, Number 56(6), pp. 1293-1321
Published: June 14, 2024
The exponential growth of big data in RNA biology (RB) has led to the development of deep learning (DL) models that have driven crucial discoveries. As constantly evidenced by DL studies in other fields, the successful implementation of DL in RB depends heavily on the effective utilization of large-scale datasets from public databases. In achieving this goal, data encoding methods, learning algorithms, and techniques that align well with biological domain knowledge have played pivotal roles. In this review, we provide guiding principles for applying these DL concepts to various problems in RB by demonstrating successful examples and associated methodologies. We also discuss the remaining challenges in developing DL models for RB and suggest strategies to overcome these challenges. Overall, this review aims to illuminate the compelling potential of DL for RB and ways to apply this powerful technology to investigate the intriguing biology of RNA more effectively.
Language: English
Cited: 10