TransGeneSelector: A Transformer-based Approach Tailored for Key Gene Mining with Small Plant Transcriptomic Datasets
Kerui Huang, Jianhong Tian, Лэй Сун et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown

Published: Sept. 28, 2023

Abstract: Gene mining, particularly from small sample sizes such as in plants, remains a challenge in the life sciences. Traditional methods often omit significant genes, while deep learning techniques are hindered by sample-size constraints and lack specialized gene mining approaches. This paper presents TransGeneSelector, the first method tailored for key gene mining with small transcriptomic datasets, integrating data augmentation, filtering, and a Transformer-based classifier. Tested on Arabidopsis thaliana seeds’ germination classification using just 79 samples, it not only achieves performance on par with, if not superior to, Random Forest and SVM but also excels in identifying upstream regulatory genes that those methods might miss, and these pinpointed genes more accurately reflect the metabolic processes inherent in seed germination. TransGeneSelector’s ability to mine vital genes from limited datasets signifies its potential as the current state of the art in such scenarios, providing an efficient and versatile solution for this critical research area.

Language: English

A comprehensive survey on applications of transformers for deep learning tasks
Saidul Islam, Hanae Elmekki, Ahmed Elsebai et al.

Expert Systems with Applications, Journal Year: 2023, Volume and Issue: 241, P. 122666 - 122666

Published: Nov. 23, 2023

Language: English

Citations

122

Leveraging transformers‐based language models in proteome bioinformatics
Nguyen Quoc Khanh Le

PROTEOMICS, Journal Year: 2023, Volume and Issue: 23(23-24)

Published: June 29, 2023

Abstract: In recent years, the rapid growth of biological data has increased interest in using bioinformatics to analyze and interpret this data. Proteomics, which studies the structure, function, and interactions of proteins, is a crucial area of bioinformatics. Using natural language processing (NLP) techniques in proteomics is an emerging field that combines machine learning and text mining. Recently, transformer‐based NLP models have gained significant attention for their ability to process variable‐length input sequences in parallel, using self‐attention mechanisms to capture long‐range dependencies. In this review paper, we discuss recent advancements in transformer‐based models for proteome bioinformatics and examine their advantages, limitations, and potential applications to improve the accuracy and efficiency of various tasks. Additionally, we highlight the challenges and future directions of these models in proteome bioinformatics research. Overall, this review provides valuable insights into how these models could revolutionize proteome bioinformatics.

Language: English

Citations

35

PIDGN: An explainable multimodal deep learning framework for early prediction of Parkinson's disease
Wenjia Li, Qiu Rao, Shuying Dong et al.

Journal of Neuroscience Methods, Journal Year: 2025, Volume and Issue: unknown, P. 110363 - 110363

Published: Jan. 1, 2025

Language: English

Citations

1

CervixFormer: A Multi-scale swin transformer-Based cervical pap-Smear WSI classification framework
Anwar A. Khan, Seung-Hyeon Han, Naveed Ilyas et al.

Computer Methods and Programs in Biomedicine, Journal Year: 2023, Volume and Issue: 240, P. 107718 - 107718

Published: July 10, 2023

Language: English

Citations

18

On knowing a gene: A distributional hypothesis of gene function
Jason J. Kwon, Joshua Pan, Guadalupe Gonzalez et al.

Cell Systems, Journal Year: 2024, Volume and Issue: 15(6), P. 488 - 496

Published: May 28, 2024

As words can have multiple meanings that depend on sentence context, genes can have various functions that depend on the surrounding biological system. This pleiotropic nature of gene function is limited by ontologies, which annotate gene functions without considering biological contexts. We contend that the gene function problem in genetics may be informed by recent technological leaps in natural language processing, in which representations of word semantics can be automatically learned from diverse contexts. In contrast to efforts to model semantics as "is-a" relationships in the 1990s, modern distributional semantics represents words as vectors in a semantic space and fuels current advances in transformer-based models such as large generative pre-trained transformers. A similar shift toward thinking of gene function as distributions over cellular contexts may enable a breakthrough in data-driven learning from biological datasets to inform gene function.

Language: English

Citations

6

RNA Sequence Analysis Landscape: A Comprehensive Review of Task Types, Databases, Datasets, Word Embedding Methods, and Language Models
Muhammad Nabeel Asim, Muhammad Ali Ibrahim, Tayyaba Asif et al.

Heliyon, Journal Year: 2025, Volume and Issue: 11(2), P. e41488 - e41488

Published: Jan. 1, 2025

Deciphering the information in RNA sequences reveals their diverse roles in living organisms, including gene regulation and protein synthesis. Aberrations in RNA sequences, such as dysregulation and mutations, can drive a spectrum of diseases including cancers, genetic disorders, and neurodegenerative conditions. Furthermore, researchers are harnessing RNA's therapeutic potential to transform traditional treatment paradigms into personalized therapies through the development of RNA-based drugs and therapies. To gain insights into the biological functions of RNA, to detect diseases at early stages, and to develop potent therapeutics, researchers are performing diverse types of RNA sequence analysis tasks. Performing these analyses with conventional wet-lab methods is expensive, time-consuming, and error-prone. To enable large-scale analysis, the empowerment of experimental methods with Artificial Intelligence (AI) applications requires scientists to have comprehensive knowledge of both the DNA and AI fields. While molecular biologists encounter challenges in understanding AI methods, computer scientists often lack the basic foundations of RNA sequence analysis. Considering the absence of literature that bridges this research gap and promotes AI-driven applications, the contributions of this manuscript are manifold: it equips readers with an understanding of 47 distinct RNA sequence analysis tasks; it sets the stage for benchmark datasets related to these tasks by presenting the cruxes of 64 different databases; it presents word embeddings and language models across these tasks; and it streamlines the development of new predictors by providing a survey of 58 word-embedding-based and 70 language-model-based predictive pipelines with their performance values, as well as the top-performing encoding methods and their performances.

Language: English

Citations

0

Application of deep learning-based multimodal fusion technology in cancer diagnosis: A survey
L. Yan, Liangrui Pan, Yijun Peng et al.

Engineering Applications of Artificial Intelligence, Journal Year: 2025, Volume and Issue: 143, P. 109972 - 109972

Published: Jan. 7, 2025

Language: English

Citations

0

ARGai 1.0: A GAN augmented in silico approach for identifying resistant genes and strains in E. coli using vision transformer
Debasish Swapnesh Kumar Nayak, Ruchika Das, Santanu Sahoo et al.

Computational Biology and Chemistry, Journal Year: 2025, Volume and Issue: 115, P. 108342 - 108342

Published: Jan. 7, 2025

Language: English

Citations

0

TransGeneSelector: using a transformer approach to mine key genes from small transcriptomic datasets in plant responses to various environments
Kerui Huang, Jianhong Tian, Лэй Сун et al.

BMC Genomics, Journal Year: 2025, Volume and Issue: 26(1)

Published: March 17, 2025

Gene mining is crucial for understanding the regulatory mechanisms underlying complex biological processes, particularly in plants responding to environmental conditions. Traditional machine learning methods, while useful, often overlook important gene relationships due to their reliance on manual feature selection and limited ability to capture inter-gene dynamics. Deep learning approaches, while powerful, are unsuitable for small sample sizes. This study introduces TransGeneSelector, the first deep learning framework specifically designed to mine key genes from small transcriptomic datasets. By integrating a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) for sample generation and a Transformer-based network for classification, TransGeneSelector efficiently addresses the challenges of small-sample data, capturing both global gene interactions and specific biological processes. Evaluated in Arabidopsis thaliana, the model achieved high classification accuracy in predicting seed germination and heat stress conditions, outperforming traditional methods like Random Forest and Support Vector Machines (SVM). Moreover, Shapley Additive Explanations (SHAP) analysis and network construction revealed that TransGeneSelector effectively identified genes that appear to have upstream regulatory functions based on our analyses, enriching them in multiple pathways which are critical for the response. RT-qPCR validation further confirmed the model's accuracy, demonstrating consistent expression patterns across varying conditions. The findings underscore the potential of TransGeneSelector as a robust tool for gene mining, offering deeper insights into gene regulation and organism adaptation under diverse environmental conditions. This work provides a framework that leverages deep learning for key gene identification.

Language: English

Citations

0

An Interpretable Hybrid Deep Learning Model for Molten Iron Temperature Prediction at the Iron-Steel Interface Based on Bi-LSTM and Transformer
Zhenzhong Shen, Weigang Han, Yanzhuo Hu et al.

Mathematics, Journal Year: 2025, Volume and Issue: 13(6), P. 975 - 975

Published: March 15, 2025

Hot metal temperature is a key factor affecting the quality and energy consumption of iron and steel smelting. Accurate prediction of the temperature drop in the hot metal ladle is very important for optimizing transport, improving efficiency, and reducing energy consumption. Most existing studies focus on molten iron in torpedo tanks, but there is a significant research gap for the ladle temperature drop, especially as the hot metal ladle is increasingly used to replace the torpedo tank in the transportation process, and this has not been fully addressed in the literature. This paper proposes an interpretable hybrid deep learning model combining Bi-LSTM and Transformer to handle the complexity of temperature drop prediction. By leveraging Catboost-RFECV, the most influential variables are selected, and the model captures both local features with Bi-LSTM and global dependencies with Transformer. Hyperparameters are optimized automatically using Optuna, enhancing model performance. Furthermore, SHAP analysis provides valuable insights into the key factors influencing temperature drops, enabling more accurate prediction of molten iron temperature. The experimental results demonstrate that the proposed model outperforms each individual model and ensemble model in terms of R², RMSE, MAE, and other evaluation metrics. Additionally, SHAP analysis identifies the key factors contributing to the temperature drop.

Language: English

Citations

0