Prediction of viral oncoproteins through the combination of generative adversarial networks and machine learning techniques DOI Creative Commons
Jorge F. Beltrán, Lisandra Herrera-Belén, Alejandro J. Yáñez

et al.

Scientific Reports, Journal Year: 2024, Volume and Issue: 14(1)

Published: Nov. 7, 2024

Viral oncoproteins play crucial roles in transforming normal cells into cancer cells, representing a significant factor in the etiology of various cancers. Traditionally, identifying these oncoproteins is both time-consuming and costly. With advancements in computational biology, bioinformatics tools based on machine learning have emerged as effective methods for predicting biological activities. Here, for the first time, we propose an innovative approach that combines Generative Adversarial Networks (GANs) with supervised machine learning to enhance the accuracy and generalizability of viral oncoprotein prediction. Our methodology evaluated multiple machine learning models, including Random Forest, Multilayer Perceptron, Light Gradient Boosting Machine, eXtreme Gradient Boosting, and Support Vector Machine. In ten-fold cross-validation on our training dataset, the GAN-enhanced Random Forest model demonstrated superior performance metrics, including 0.976 accuracy, 0.977 precision, and 1.0 AUC, with F1 score and sensitivity also reported. During independent testing, this model scored 0.982. These results establish a new tool, VirOncoTarget, accessible via a web application. We anticipate that VirOncoTarget will be a valuable resource for researchers, enabling rapid and reliable prediction of viral oncoproteins and advancing the understanding of their role in cancer biology.

Language: English

pACP-HybDeep: predicting anticancer peptides using binary tree growth based transformer and structural feature encoding with deep-hybrid learning DOI Creative Commons
Muhammad Khalil Shahid, Maqsood Hayat, Wajdi Alghamdi

et al.

Scientific Reports, Journal Year: 2025, Volume and Issue: 15(1)

Published: Jan. 2, 2025

Worldwide, cancer remains a significant health concern due to its high mortality rates. Despite numerous traditional therapies and wet-laboratory methods for treating cancer-affected cells, these approaches often face limitations, including high costs and substantial side effects. Recently, the selectivity of peptides has garnered attention from scientists due to their reliable targeted actions and minimal adverse effects. Furthermore, keeping in view the outcomes of existing computational models, we propose a highly effective model, namely pACP-HybDeep, for the accurate prediction of anticancer peptides. In this model, the training peptides are numerically encoded using an attention-based ProtBERT-BFD encoder to extract semantic features along with CTDT-based structural information. A k-nearest neighbor-based binary tree growth (BTG) algorithm is employed to select an optimal feature set from the multi-perspective vector. The selected feature vector is subsequently trained using a CNN + RNN-based deep learning model. Our proposed model demonstrated an accuracy of 95.33% and an AUC of 0.97. To validate the generalization capabilities of our model, it achieved accuracies of 94.92%, 92.26%, and 91.16% on the independent datasets Ind-S1, Ind-S2, and Ind-S3, respectively. The efficacy and reliability of the model on test data establish it as a valuable tool for researchers in academia and pharmaceutical drug design.

Language: English

Citations

3

XGBoost-enhanced ensemble model using discriminative hybrid features for the prediction of sumoylation sites DOI Creative Commons
Salman Khan, Sumaiya Noor, Tahir Javed

et al.

BioData Mining, Journal Year: 2025, Volume and Issue: 18(1)

Published: Feb. 3, 2025

Language: English

Citations

2

Addressing imbalanced data classification with Cluster-Based Reduced Noise SMOTE DOI Creative Commons

Javad Hemmatian, Rassoul Hajizadeh, Fakhroddin Nazari

et al.

PLoS ONE, Journal Year: 2025, Volume and Issue: 20(2), P. e0317396 - e0317396

Published: Feb. 10, 2025

In recent years, the challenge of imbalanced data has become increasingly prominent in machine learning, affecting the performance of classification algorithms. This study proposes a novel data-level oversampling method called Cluster-Based Reduced Noise SMOTE (CRN-SMOTE) to address this issue. CRN-SMOTE combines SMOTE for oversampling minority classes with a cluster-based noise reduction technique. In this noise reduction approach, it is crucial that samples from each category form one or two clusters, a feature that conventional noise reduction methods do not achieve. The proposed method is evaluated on four imbalanced datasets (ILPD, QSAR, Blood, and Maternal Health Risk) using five metrics: Cohen’s kappa, Matthews correlation coefficient (MCC), F1-score, precision, and recall. Results demonstrate that CRN-SMOTE consistently outperformed the state-of-the-art Reduced Noise SMOTE (RN-SMOTE), SMOTE-Tomek Link, and SMOTE-ENN methods across all datasets, with particularly notable improvements observed on the QSAR and Maternal Health Risk datasets, indicating its effectiveness in enhancing imbalanced classification performance. Overall, the experimental findings indicate that CRN-SMOTE outperformed RN-SMOTE in 100% of cases, achieving average improvements of 6.6% in Kappa, 4.01% in MCC, 1.87% in F1-score, 1.7% in precision, and 2.05% in recall, with SMOTE’s number of neighbors set to 5.
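The oversampling step that the abstract builds on can be sketched roughly as follows. This is plain SMOTE only, written in standard-library Python under stated assumptions: the function name `smote`, the toy `minority` points, and the seed are all hypothetical, and the cluster-based noise reduction that distinguishes CRN-SMOTE is not reproduced here. The default `k=5` mirrors the neighbor count quoted in the abstract.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by interpolating towards neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding the point itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + lam * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical two-feature minority class, for illustration only.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (0.9, 0.8), (1.3, 1.0)]
new_points = smote(minority, n_new=4, k=5)
```

Each synthetic point lies on the segment between a minority sample and one of its nearest minority neighbours, which is why noisy or isolated minority samples can produce misleading synthetic data and why noise-reduction variants exist.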

Language: English

Citations

1

Explainable AI-driven prediction of APE1 inhibitors: enhancing cancer therapy with machine learning models and feature importance analysis DOI

Aga Basit Iqbal, Tariq Masoodi, Ajaz A. Bhat

et al.

Molecular Diversity, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 21, 2025

Language: English

Citations

1

Stack-AVP: a stacked ensemble predictor based on multi-view information for fast and accurate discovery of antiviral peptides DOI
Phasit Charoenkwan, Pramote Chumnanpuen, Nalini Schaduangrat

et al.

Journal of Molecular Biology, Journal Year: 2024, Volume and Issue: unknown, P. 168853 - 168853

Published: Nov. 1, 2024

Language: English

Citations

5

DeepAIPs-Pred: Predicting Anti-Inflammatory Peptides Using Local Evolutionary Transformation Images and Structural Embedding-Based Optimal Descriptors with Self-Normalized BiTCNs DOI
Shahid Akbar, Matee Ullah, Ali Raza

et al.

Journal of Chemical Information and Modeling, Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 3, 2024

Inflammation is a biological response to harmful stimuli, playing a crucial role in facilitating tissue repair by eradicating pathogenic microorganisms. However, when inflammation becomes chronic, it leads to numerous serious disorders, particularly autoimmune diseases. Anti-inflammatory peptides (AIPs) have emerged as promising therapeutic agents due to their high specificity, potency, and low toxicity. However, identifying AIPs using traditional in vivo methods is time-consuming and expensive. Recent advancements in computational-based intelligent models for peptides have offered a cost-effective alternative for various inflammatory diseases, owing to their selectivity toward targeted cells with low side effects. In this paper, we propose a novel computational model, namely DeepAIPs-Pred, for the accurate prediction of AIP sequences. The training samples are represented using LBP-PSSM- and LBP-SMR-based evolutionary image transformation methods. Additionally, to capture contextual semantic features, we employed attention-based ProtBERT-BFD embedding, together with QLC for structural features. Furthermore, differential evolution (DE)-based weighted feature integration is utilized to produce a multiview feature vector. SMOTE-Tomek Links are introduced to address the class imbalance problem, and a two-layer feature selection technique is proposed to reduce and select the optimal features. Finally, self-normalized bidirectional temporal convolutional networks (SnBiTCN) are trained, achieving a significant predictive accuracy of 94.92% and an AUC of 0.97. The generalization of our model is validated on two independent datasets, demonstrating higher performance, with improvements of ∼2% and ∼10% in accuracy over existing state-of-the-art models on Ind-I and Ind-II, respectively. The efficacy and reliability of DeepAIPs-Pred highlight its potential as a valuable tool for drug development and research in academia.

Language: English

Citations

5

SurvBeNIM: The Beran-Based Neural Importance Model for Explaining Survival Models DOI Creative Commons
Lev V. Utkin, Danila Y. Eremenko, Andrei V. Konstantinov

et al.

IEEE Access, Journal Year: 2025, Volume and Issue: 13, P. 24137 - 24157

Published: Jan. 1, 2025

Language: English

Citations

0

An efficient interpretable framework for unsupervised low, very low and extreme birth weight detection DOI Creative Commons
Ali Nawaz, Amir Ahmad, Shehroz S. Khan

et al.

PLoS ONE, Journal Year: 2025, Volume and Issue: 20(1), P. e0317843 - e0317843

Published: Jan. 30, 2025

Detecting low birth weight is crucial for early identification of at-risk pregnancies, which are associated with significant neonatal and maternal morbidity and mortality risks. This study presents an efficient and interpretable framework for the unsupervised detection of low, very low, and extreme birth weights. While traditional approaches to managing class imbalance require labeled data, our study explores the use of unsupervised learning to detect anomalies indicative of low birth weight scenarios. This method is particularly valuable in contexts where labeled data are scarce or labels for the anomaly class are not available, allowing preliminary insights that can inform further data labeling and more focused supervised learning efforts. We employed fourteen different anomaly detection algorithms and evaluated their performance using the Area Under the Receiver Operating Characteristics (AUCROC) and Area Under the Precision-Recall Curve (AUCPR) metrics. Our experiments demonstrated that One Class Support Vector Machine (OCSVM) and Empirical-Cumulative-distribution-based Outlier Detection (ECOD) effectively identified anomalies across birth weight categories. The OCSVM attained an AUCROC of 0.72 and an AUCPR of 0.0253 for LBW detection, while the ECOD model was competitive, with a value of 0.045 in its cases. Additionally, a novel feature perturbation technique was introduced to enhance the interpretability of the models by providing insights into the relative importance of various prenatal features. The proposed interpretation methodology, validated by clinician experts, reveals promise for early intervention strategies and improved care.
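For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal standard-library sketch of how they can be computed from anomaly scores. The function names, labels, and scores are hypothetical, and average precision is used as one common estimator of the area under the precision-recall curve; the paper's exact computation may differ.

```python
def auc_roc(labels, scores):
    """Probability that a random anomaly outscores a random normal case (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each true anomaly."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical data: 1 marks an anomaly, scores are detector outputs (higher = more anomalous).
labels = [0, 0, 1, 0, 1, 0, 0, 0]
scores = [0.10, 0.40, 0.90, 0.30, 0.35, 0.20, 0.80, 0.05]

roc = auc_roc(labels, scores)
ap = average_precision(labels, scores)
```

A random ranking scores about 0.5 on AUCROC but only about the anomaly prevalence on AUCPR, so AUCPR values are best read against how rare the anomaly class is.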

Language: English

Citations

0

Early warning strategies for corporate operational risk: A study by an improved random forest algorithm using FCM clustering DOI Creative Commons

X. Fang

PLoS ONE, Journal Year: 2025, Volume and Issue: 20(3), P. e0318491 - e0318491

Published: March 11, 2025

To enhance the accuracy and response speed of the risk early warning system, this study develops a novel early warning system that combines the Fuzzy C-Means (FCM) clustering algorithm and the Random Forest (RF) model. Firstly, based on operational risk theory, market risk, research and development risk, financial risk, and human resource risk are selected as the primary indicators for enterprise operational risk assessment. Secondly, the Criteria Importance Through Intercriteria Correlation (CRITIC) weight method is employed to determine the importance of these indicators, thereby enhancing the model's prediction ability and stability. Following this, the FCM clustering algorithm is utilized for pre-processing the sample data to improve the efficiency and accuracy of data classification. Finally, an improved RF model is constructed by optimizing the parameters of the RF algorithm. The data are mainly from RESSET/DB, covering the issuance, trading, and rating of fixed-income products such as bonds, government bonds, and corporate bonds; it also provides basic information, net value, position, and performance data for funds. The experimental results show that the model achieves an F1 score of 87.26%, an accuracy of 87.95%, an Area under the Curve (AUC) of 91.20%, a precision of 89.29%, and a recall of 87.48%. These are respectively 6.45%, 4.45%, 5.09%, 4.81%, and 3.83% higher than those of the traditional RF model. In this study, the early warning system is successfully constructed, and the model's ability to handle complex data is significantly improved.
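The CRITIC weighting step mentioned in the abstract can be illustrated with a small standard-library sketch. It follows the commonly described form of the method: min-max normalize each indicator, then weight indicator j in proportion to σ_j · Σ_k (1 − r_jk), where σ_j is its standard deviation and r_jk its Pearson correlation with indicator k. The indicator scores below are hypothetical, all indicators are assumed to be "higher is riskier", and the sample standard deviation is an assumption; the paper's own preprocessing may differ.

```python
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def critic_weights(rows):
    """Return one CRITIC weight per column of rows (rows = samples, columns = indicators)."""
    cols = list(zip(*rows))
    # Min-max normalisation so that indicators on different scales are comparable
    norm = [[(v - min(c)) / (max(c) - min(c)) for v in c] for c in cols]
    sigma = [statistics.stdev(c) for c in norm]  # contrast intensity of each indicator
    info = [
        sigma[j] * sum(1 - pearson(norm[j], norm[k]) for k in range(len(norm)))
        for j in range(len(norm))
    ]  # information content: high spread and low redundancy score highest
    total = sum(info)
    return [c / total for c in info]


# Hypothetical scores for five firms on four indicators:
# market risk, R&D risk, financial risk, human resource risk.
rows = [
    (0.62, 0.35, 0.48, 0.20),
    (0.40, 0.55, 0.30, 0.25),
    (0.75, 0.20, 0.66, 0.15),
    (0.55, 0.45, 0.52, 0.40),
    (0.30, 0.60, 0.25, 0.35),
]
weights = critic_weights(rows)
```

Indicators that vary widely and overlap little with the others receive larger weights, which is the sense in which the method measures the "importance" of each indicator without expert judgment.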

Language: English

Citations

0

Sequential recommendation via agent-based irrelevancy skipping DOI

Yu Cheng, Jiawei Zheng, B.H. Wu

et al.

Neural Networks, Journal Year: 2025, Volume and Issue: 185, P. 107134 - 107134

Published: Jan. 9, 2025

Language: English

Citations

0