Development of method using language processing techniques for extracting information on drug–health food product interactions
Mari Yoshizaki, Yuki Kuriya, Masaki Yamamoto

et al.

British Journal of Clinical Pharmacology, Journal Year: 2024, Volume and Issue: 90(6), P. 1514 - 1524

Published: March 20, 2024

Health food products (HFPs) are foods and products related to maintaining and promoting health. HFPs may sometimes cause unforeseen adverse health effects by interacting with drugs. Considering the importance of information on interactions between HFPs and drugs, this study aimed to establish a workflow to extract drug-HFP interactions (DHIs) from open resources.

Language: English

Advanced machine learning framework for enhancing breast cancer diagnostics through transcriptomic profiling (Creative Commons)
Mohamed J. Saadh, Hanan Hassan Ahmed, Radhwan Abdul Kareem

et al.

Discover Oncology, Journal Year: 2025, Volume and Issue: 16(1)

Published: March 17, 2025

This study proposes an advanced machine learning (ML) framework for breast cancer diagnostics by integrating transcriptomic profiling with optimized feature selection and classification techniques. A dataset of 1759 samples (987 patients, 772 healthy controls) was analyzed using Recursive Feature Elimination, Boruta, and ElasticNet selection. Dimensionality reduction techniques, including Non-Negative Matrix Factorization (NMF), Autoencoders, and transformer-based embeddings (BioBERT, DNABERT), were applied to enhance model interpretability. Classifiers such as XGBoost, LightGBM, ensemble voting, Multi-Layer Perceptron, and Stacking were trained with grid search and cross-validation. Model evaluation was conducted using accuracy, AUC, MCC, Kappa Score, ROC, and PR curves, and external validation was performed on an independent set of 175 samples. XGBoost and LightGBM achieved the highest test accuracies (0.91 and 0.90) and AUC values (up to 0.92), particularly with NMF and BioBERT. The Voting method exhibited the best accuracy (0.92), confirming its robustness. Transformer-based techniques significantly improved performance compared with conventional approaches like PCA and Decision Trees. The proposed ML framework enhances diagnostic interpretability, demonstrating strong generalizability on the independent dataset. These findings highlight its potential for precision oncology and personalized diagnostics.

Language: English

Citations

0

Robust enzyme discovery and engineering with deep learning using CataPro (Creative Commons)
Zechen Wang, Dongqi Xie, Di Wu

et al.

Nature Communications, Journal Year: 2025, Volume and Issue: 16(1)

Published: March 20, 2025

Accurate prediction of enzyme kinetic parameters is crucial for enzyme exploration and modification. Existing models face the problem of either low accuracy or poor generalization ability due to overfitting. In this work, we first developed unbiased datasets to evaluate the actual performance of these methods and proposed a deep learning model, CataPro, based on pre-trained molecular fingerprints, to predict turnover number (kcat), Michaelis constant (Km), and catalytic efficiency (kcat/Km). Compared with previous baseline models, CataPro demonstrates clearly enhanced performance on the unbiased datasets. In a representational enzyme mining project, by combining CataPro with traditional methods, we identified an enzyme (SsCSO) with 19.53 times increased activity compared with the initial enzyme (CSO2) and then successfully engineered it to improve its activity by 3.34 times. This reveals the high potential of CataPro as an effective tool for future enzyme discovery.

Language: English

Citations

0

Transformer-based deep learning enables improved B-cell epitope prediction in parasitic pathogens: A proof-of-concept study on Fasciola hepatica (Creative Commons)
Rui-Si Hu, Kui Gu, Muhammad Ehsan

et al.

PLoS neglected tropical diseases, Journal Year: 2025, Volume and Issue: 19(4), P. e0012985 - e0012985

Published: April 29, 2025

Background: The identification of B-cell epitopes (BCEs) is fundamental to advancing epitope-based vaccine design, therapeutic antibody development, and diagnostics, such as in neglected tropical diseases caused by parasitic pathogens. However, the structural complexity of parasite antigens and the high cost of experimental validation present certain challenges. Advances in Artificial Intelligence (AI)-driven protein engineering, particularly through machine learning and deep learning, offer efficient solutions to enhance prediction accuracy and reduce costs. Methodology/Principal findings: Here, we present deepBCE-Parasite, a Transformer-based model designed to predict linear BCEs from peptide sequences. By leveraging a state-of-the-art self-attention mechanism, the model achieved remarkable predictive performance, with an accuracy of approximately 81% and an AUC of 0.90 in both 10-fold cross-validation and independent testing. Comparative analyses against 12 handcrafted features and four conventional algorithms (GNB, SVM, RF, and LGBM) highlighted the superior predictive power of the model. As a case study, deepBCE-Parasite predicted eight BCEs in leucine aminopeptidase (LAP) from Fasciola hepatica proteomic data. Dot-blot immunoassays confirmed the specific binding of seven synthetic peptides to positive sera, validating their IgG reactivity and demonstrating the model's efficacy in BCE prediction. Conclusions/Significance: deepBCE-Parasite demonstrates excellent performance in predicting BCEs across diverse pathogens, offering a valuable tool for the design of vaccines, antibodies, and diagnostic applications in parasitology.

Language: English

Citations

0

Bioinfo-Bench: A Simple Benchmark Framework for LLM Bioinformatics Skills Evaluation (Creative Commons)
Qiyuan Chen, Cheng Deng

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown

Published: Oct. 21, 2023

Large Language Models (LLMs) have garnered significant recognition in the life sciences for their capacity to comprehend and utilize knowledge. The contemporary expectation in diverse industries extends beyond employing LLMs merely as chatbots; instead, there is a growing emphasis on harnessing their potential as adept analysts proficient in dissecting intricate issues within these sectors. The realm of bioinformatics is no exception to this trend. In this paper, we introduce Bioinfo-Bench, a novel yet straightforward benchmark framework suite crafted to assess the academic knowledge and data mining capabilities of foundational models in bioinformatics. The benchmark was systematically gathered from three distinct perspectives: acquisition, analysis, and application, facilitating a comprehensive examination of LLMs. Our evaluation encompassed prominent models: ChatGPT, Llama, and Galactica. The findings revealed that these models excel when drawing heavily upon what they retained from training. However, their proficiency in addressing practical professional queries and conducting nuanced inference remains constrained. Given these insights, we are poised to delve deeper into this domain, engaging in further extensive research and discourse. It is pertinent to note that this project is currently in progress, and all associated materials will be made publicly accessible.

Language: English

Citations

10

Drug Discovery in the Age of Artificial Intelligence: Transformative Target-Based Approaches (Open Access)
Akshata Y. Patne, Sai Madhav Dhulipala, William F. Lawless

et al.

International Journal of Molecular Sciences, Journal Year: 2024, Volume and Issue: 25(22), P. 12233 - 12233

Published: Nov. 14, 2024

The complexities inherent in drug development are multi-faceted and often hamper accuracy, speed, and efficiency, thereby limiting success. This review explores how recent developments in machine learning (ML) are significantly impacting target-based drug discovery, particularly small-molecule approaches. The Simplified Molecular Input Line Entry System (SMILES), which translates a chemical compound's three-dimensional structure into a string of symbols, is now widely used in drug design, mining, and repurposing. Utilizing ML and natural language processing techniques, SMILES has revolutionized lead identification, high-throughput screening, and virtual screening. ML models enhance the accuracy of predicting binding affinity and selectivity, reducing the need for extensive experimental work. Additionally, deep learning, with its strengths in analyzing spatial and sequential data through convolutional neural networks (CNNs) and recurrent neural networks (RNNs), shows promise in screening, target identification, and de novo design. Fragment-based approaches also benefit from ML algorithms and techniques like generative adversarial networks (GANs), which predict fragment properties and affinities, aiding hit selection and design optimization. Structure-based drug design, which relies on high-resolution protein structures, leverages accurate predictions of interactions. While challenges such as interpretability and data quality remain, ML's transformative impact accelerates drug discovery, increasing efficiency and innovation. Its potential to deliver new and improved treatments for various diseases is significant.

Language: English

Citations

3

VF-Pred: Predicting virulence factor using sequence alignment percentage and ensemble learning models
Shreya Singh, Nguyen Quoc Khanh Le, Cheng Wang

et al.

Computers in Biology and Medicine, Journal Year: 2023, Volume and Issue: 168, P. 107662 - 107662

Published: Nov. 3, 2023

Language: English

Citations

7

Intercellular pathways of cancer treatment-related cardiotoxicity and their therapeutic implications: the paradigm of radiotherapy
Stella Logotheti, Athanasia Pavlopoulou, Hamid Khoshfekr Rudsari

et al.

Pharmacology & Therapeutics, Journal Year: 2024, Volume and Issue: 260, P. 108670 - 108670

Published: May 31, 2024

Language: English

Citations

2

DeepAT: A Deep Learning Wheat Phenotype Prediction Model Based on Genotype Data (Creative Commons)
Jinchen Li, Zikang He, Guomin Zhou

et al.

Agronomy, Journal Year: 2024, Volume and Issue: 14(12), P. 2756 - 2756

Published: Nov. 21, 2024

Genomic selection serves as an effective way for crop genetic breeding, capable of significantly shortening the breeding cycle and improving the accuracy of breeding. Phenotype prediction can help identify variants associated with specific phenotypes. This provides a data-driven criterion for genomic selection, making the process more efficient and targeted. Deep learning has become an important tool for phenotype prediction due to its abilities in automatic feature learning, nonlinear modeling, and high-dimensional data processing. Current deep learning models have made improvements in various aspects, such as predictive performance and computation time, but they still have limitations in capturing the complex relationships between genotype and phenotype, indicating that there is room for improvement in phenotype prediction. This study innovatively proposes a new method called DeepAT, which mainly includes an input layer, a feature extraction layer, a relationship capture layer, and an output layer. It can predict wheat yield based on genotype data, with innovations in the following four aspects: (1) The feature extraction layer of DeepAT extracts representative vectors from SNP data. By introducing the ReLU activation function, it enhances the model's ability to express features and accelerates convergence speed; (2) it can handle the genotype data while retaining as much useful information as possible; (3) it effectively captures relationships among low-dimensional features through a self-attention mechanism; (4) Compared with traditional RNN structures, model training is more stable. Using the public dataset AGT, comparative experiments with three machine learning and six deep learning methods found that DeepAT exhibited better performance than the other methods, achieving 99.98% accuracy, a mean squared error (MSE) of only 28.93 tonnes, and a Pearson correlation coefficient close to 1, with predicted values closely matching observed values. It offers a new perspective on deep learning-assisted phenotype prediction and has great potential for smart breeding.

Language: English

Citations

2

DeepPTM: Protein Post-translational Modification Prediction from Protein Sequences by Combining Deep Protein Language Model with Vision Transformers
Necla Nisa Soylu, Emre Sefer

Current Bioinformatics, Journal Year: 2024, Volume and Issue: 19(9), P. 810 - 824

Published: Feb. 2, 2024

Introduction: More recent self-supervised deep language models, such as Bidirectional Encoder Representations from Transformers (BERT), have performed the best on some tasks by contextualizing word embeddings for a better dynamic representation. Their protein-specific versions, such as ProtBERT, generated protein sequence embeddings, which resulted in better performance on several bioinformatics tasks. Besides, a number of different post-translational modifications are prominent in cellular development and differentiation. The current biological experiments can detect these modifications, but within a longer duration and with significant cost. Methods: In this paper, to comprehend the accompanying processes concisely and more rapidly, we propose DEEPPTM to predict protein post-translational modification (PTM) sites from protein sequences efficiently. Different from the current methods, DEEPPTM enhances prediction by integrating specialized ProtBERT-based embeddings with attention-based vision transformers (ViT), and reveals the associations between modification types and sequence content. Additionally, it can infer modifications over different species. Results: Human and mouse ROC AUCs for predicting Succinylation were 0.793 and 0.661, respectively, once 10-fold cross-validation is applied. Similarly, we obtained 0.776, 0.764, and 0.734 AUC scores for inferring ubiquitination, crotonylation, and glycation sites, respectively. According to detailed computational experiments, DEEPPTM lessens the time spent in the laboratory while outperforming the competing methods as well as baselines on all 4 modification sites. In our case, deep learning approaches look more favorable on ProtBERT features than traditional machine learning techniques. Conclusion: The protein-specific model is more effective than the original BERT for PTM prediction. Our code and datasets can be found at https://github.com/seferlab/deepptm.

Language: English

Citations

1

Assessing parameter efficient methods for pre-trained language model in annotating scRNA-seq data
Yucheng Xia, Yuhang Liu, Tianhao Li

et al.

Methods, Journal Year: 2024, Volume and Issue: 228, P. 12 - 21

Published: May 15, 2024

Language: English

Citations

1