Deep Learning for Elucidating Modifications to RNA—Status and Challenges Ahead DOI Open Access
Sarah Rennie

Genes, Journal Year: 2024, Volume and Issue: 15(5), P. 629 - 629

Published: May 15, 2024

RNA-binding proteins and chemical modifications to RNA play vital roles in the co- and post-transcriptional regulation of genes. In order to fully decipher their biological roles, it is an essential task to catalogue their precise target locations along with their preferred contexts and sequence-based determinants. Recently, deep learning approaches have significantly advanced this field. These methods can predict the presence or absence of modification at specific genomic regions based on diverse features, particularly sequence and secondary structure, allowing us to decipher the highly non-linear sequence patterns and structures that underlie site preferences. This article provides an overview of how deep learning is being applied to this area, with a particular focus on the problem of mRNA-RBP binding, while also considering other types of modification to RNA. It discusses how different types of model can handle sequence-based and/or secondary-structure-based inputs, the process of model training, including the choice of negative regions and separating sets for testing and training, and offers recommendations for developing biologically relevant models. Finally, it highlights four key areas that are crucial for advancing the field.

Language: English

Deep learning model to discriminate diverse infection types based on pairwise analysis of host gene expression DOI Creative Commons

Jize Xie, Xubin Zheng, Jianlong Yan et al.

iScience, Journal Year: 2024, Volume and Issue: 27(6), P. 109908 - 109908

Published: May 7, 2024

Accurate detection of pathogens, particularly distinguishing between Gram-positive and Gram-negative bacteria, could improve disease treatment. Host gene expression can capture the immune system's response to infections caused by various pathogens. Here, we present a deep neural network model, bvnGPS2, which incorporates the attention mechanism based on a large-scale integrated host transcriptome dataset to precisely identify Gram-positive and Gram-negative bacterial infections as well as viral infections. We performed analysis of 4,949 blood samples across 40 cohorts from 10 countries using our previously designed omics data integration method, iPAGE, to select discriminant gene pairs and to train bvnGPS2. The performance of the model was evaluated on six independent cohorts comprising 374 samples. Overall, our deep neural network model shows robust capability to accurately identify specific infections, paving the way for precise medicine strategies in infection treatment and potentially also for identifying subtypes of other diseases.

Language: English

Citations: 6

Deciphering 3'UTR Mediated Gene Regulation Using Interpretable Deep Representation Learning DOI Creative Commons
Yuning Yang, Gen Li, Kuan Pang et al.

Advanced Science, Journal Year: 2024, Volume and Issue: 11(39)

Published: Aug. 19, 2024

The 3' untranslated regions (3'UTRs) of messenger RNAs contain many important cis‐regulatory elements that are under functional and evolutionary constraints. It is hypothesized that these constraints are similar to grammars and syntaxes in human languages and can be modeled by advanced natural language techniques such as Transformers, which have been very effective in modeling complex protein sequence and structures. Here 3UTRBERT is described, which implements an attention‐based language model, i.e., Bidirectional Encoder Representations from Transformers (BERT). 3UTRBERT is pre‐trained on aggregated 3'UTR sequences of human mRNAs in a task‐agnostic manner; the pre‐trained model is then fine‐tuned for specific downstream tasks such as identifying RBP binding sites, identifying m6A RNA modification sites, and predicting RNA sub‐cellular localizations. Benchmark results show that 3UTRBERT generally outperformed other contemporary methods in each of these tasks. More importantly, the self‐attention mechanism within 3UTRBERT allows direct visualization of the semantic relationship between sequence elements and effectively identifies regions with important regulatory potential. 3UTRBERT is expected to serve as a foundational tool to analyze various sequence labeling tasks within the 3'UTR fields, thus enhancing the decipherability of post‐transcriptional regulatory mechanisms.

Language: English

Citations: 6

CFPLncLoc: A multi-label lncRNA subcellular localization prediction based on Chaos game representation and centralized feature pyramid DOI
Sheng Wang, Zu‐Guo Yu, Han Guosheng et al.

International Journal of Biological Macromolecules, Journal Year: 2025, Volume and Issue: 297, P. 139519 - 139519

Published: Jan. 5, 2025

Language: English

Citations: 0

An ensemble deep learning framework for multi-class LncRNA subcellular localization with innovative encoding strategy DOI Creative Commons
Wenxing Hu, Yan Yue, Ruomei Yan et al.

BMC Biology, Journal Year: 2025, Volume and Issue: 23(1)

Published: Feb. 21, 2025

Long non-coding RNAs (lncRNAs) play pivotal roles in various cellular processes, and elucidating their subcellular localization can offer crucial insights into their functional significance. Accurate prediction of lncRNA subcellular localization is of paramount importance. Despite numerous computational methods developed for this purpose, existing approaches still encounter challenges stemming from the complexity of data representation and the difficulty of capturing nucleotide distribution information within sequences. In this study, we propose a novel deep learning-based model, termed MGBLncLoc, which incorporates a unique multi-class encoding technique known as generalized encoding based on the Distribution Density of Multi-Class Nucleotide Groups (MCD-ND). This encoding approach enables more precise reflection of nucleotide distributions, distinguishing between constant and discriminative regions within sequences, thereby enhancing prediction performance. Additionally, our deep learning model integrates advanced neural network modules, including Multi-Dconv Head Transposed Attention, Gated-Dconv Feed-forward Network, Convolutional Neural Network, and Bidirectional Gated Recurrent Unit, to comprehensively exploit sequence features of lncRNA. Comparative analysis against commonly used feature encoding methods and prediction models validates the effectiveness of MGBLncLoc, demonstrating superior performance. This research offers effective solutions for predicting lncRNA subcellular localization, providing valuable support for related biological investigations.

Language: English

Citations: 0

TransBind allows precise detection of DNA-binding proteins and residues using language models and deep learning DOI Creative Commons
Md Toki Tahmid, A.K.M. Mehedi Hasan, Md. Shamsuzzoha Bayzid et al.

Communications Biology, Journal Year: 2025, Volume and Issue: 8(1)

Published: April 5, 2025

Identifying DNA-binding proteins and their binding residues is critical for understanding diverse biological processes, but conventional experimental approaches are slow and costly. Existing machine learning methods, while faster, often lack accuracy and struggle with data imbalance, relying heavily on evolutionary profiles like PSSMs and HMMs derived from multiple sequence alignments (MSAs). These dependencies make them unsuitable for orphan proteins or those that evolve rapidly. To address these challenges, we introduce TransBind, an alignment-free deep learning framework that predicts DNA-binding proteins and residues directly from a single primary sequence, eliminating the need for MSAs. By leveraging features from pre-trained protein language models, TransBind effectively handles the issue of data imbalance and achieves superior performance. Extensive evaluations using diverse experimental datasets and case studies demonstrate that TransBind significantly outperforms state-of-the-art methods in terms of both accuracy and computational efficiency. TransBind is available as a web server at https://trans-bind-web-server-frontend.vercel.app/.

Language: English

Citations: 0

PAGE-based transfer learning from single-cell to bulk sequencing enhances model generalization for sepsis diagnosis DOI Creative Commons

Nana Jin, Chuanchuan Nan, Wanyang Li et al.

Briefings in Bioinformatics, Journal Year: 2024, Volume and Issue: 26(1)

Published: Nov. 22, 2024

Sepsis, caused by infections, sparks a dangerous bodily response. The transcriptional expression patterns of host responses aid in the diagnosis of sepsis, but the challenge lies in their limited generalization capabilities. To facilitate sepsis diagnosis, we present an updated version of single-cell Pair-wise Analysis of Gene Expression (scPAGE) using a transfer learning method, scPAGE2, dedicated to data fusion between single-cell and bulk transcriptome. Compared to scPAGE, the upgrade to scPAGE2 featured ameliorated Differentially Expressed Gene Pairs (DEPs) for pretraining a model in the single-cell transcriptome and retrained it using bulk transcriptome data to construct a diagnostic model of sepsis, which effectively transferred cell-layer information from single-cell to bulk transcriptome. Seven datasets across three transcriptome platforms and fluorescence-activated cell sorting (FACS) were used for performance validation. The model involved four DEPs, showing robust performance across next-generation sequencing and microarray platforms, surpassing state-of-the-art models with an average AUROC of 0.947 and an average AUPRC of 0.987. Analysis of scRNA-seq data reveals higher proportions of JAM3-PIK3AP1 in monocytes, and decreased ARG1-CCR7 in B and T cells. Elevated IRF6-HP in monocytes is confirmed in both scRNA-seq data and an independent cohort using FACS. Both the superior performance of the diagnostic model and the in vitro validation emphasize that scPAGE2 is effective in the construction of a diagnostic model of sepsis. We additionally applied scPAGE2 to acute myeloid leukemia and demonstrated its classification performance. Overall, we provided a strategy to improve generalizability, and scPAGE2 can be adapted to a broad range of clinical prediction scenarios.

Language: English

Citations: 3

mRNA-CLA: An interpretable deep learning approach for predicting mRNA subcellular localization DOI
Yi‐Fan Chen, Zhenya Du, Xuanbai Ren et al.

Methods, Journal Year: 2024, Volume and Issue: 227, P. 17 - 26

Published: May 3, 2024

Language: English

Citations: 1

RNALocate v3.0: Advancing the Repository of RNA Subcellular Localization with Dynamic Analysis and Prediction DOI Creative Commons

Le Wu, Luqi Wang, Shijie Hu et al.

Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 53(D1), P. D284 - D292

Published: Oct. 15, 2024

Subcellular localization of RNA is a crucial mechanism for regulating diverse biological processes within cells. Dynamic RNA subcellular localizations are essential for maintaining cellular homeostasis; however, their distribution and changes during development and differentiation remain largely unexplored. To elucidate the dynamic patterns of RNA distribution within cells, we have upgraded RNALocate to version 3.0, a repository for RNA-subcellular localization (http://www.rnalocate.org/ or http://www.rna-society.org/rnalocate/). RNALocate v3.0 incorporates and analyzes RNA subcellular localization sequencing data from over 850 samples, with a specific focus on the dynamic changes in subcellular localizations under various conditions. The species coverage has also been expanded to encompass mammals, non-mammals, plants and microbes. Additionally, we provide an integrated prediction algorithm for the subcellular localization of seven RNA types across eleven subcellular compartments, utilizing convolutional neural networks (CNNs) and transformer models. Overall, RNALocate v3.0 contains a total of 1 844 013 RNA-localization entries covering 26 RNA types, 242 species and 177 subcellular localizations. It serves as a comprehensive and readily accessible data resource for RNA-subcellular localization, facilitating the elucidation of cellular function and disease pathogenesis.

Language: English

Citations: 1
