TargetCLP: clathrin proteins prediction combining transformed and evolutionary scale modeling-based multi-view features via weighted feature integration approach DOI Creative Commons
Matee Ullah, Shahid Akbar, Ali Raza, et al.

Briefings in Bioinformatics, Journal Year: 2024, Volume and Issue: 26(1)

Published: Nov. 22, 2024

Abstract Clathrin proteins, key elements of the vesicle coat, play a crucial role in various cellular processes, including neural function, signal transduction, and endocytosis. Disruptions in clathrin protein functions have been associated with a wide range of diseases, such as Alzheimer’s, neurodegeneration, viral infection, and cancer. Therefore, correctly identifying clathrin proteins is critical to unravel the mechanism of these fatal diseases and to design drug targets. This paper presents a novel computational method, named TargetCLP, to precisely identify clathrin proteins. TargetCLP leverages four single-view feature representation methods: two transformed feature sets (PSSM-CLBP and RECM-CLBP), one qualitative characteristics feature, and one deep-learned-based embedding using ESM. The single-view features are integrated based on their weights using differential evolution, and the BTG feature selection algorithm is utilized to generate a more optimal, reduced feature subset. The model is trained with several classifiers, among which the proposed SnBiLSTM achieved remarkable performance. Experimental and comparative results on both training and independent datasets show that TargetCLP offers significant improvements in terms of prediction accuracy and generalization to unseen data, furthering advancements in the research field.
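The weighted integration step described in the abstract can be pictured with a minimal sketch. It assumes that integration means scaling each single-view feature vector by a normalized weight and concatenating the results; the function name `integrate_views`, the view names, and the example weights are all hypothetical and are not taken from the paper. In TargetCLP the weights are reported to come from differential evolution, which is not implemented here.

```python
from typing import Dict, List, Sequence


def integrate_views(views: Dict[str, Sequence[float]],
                    weights: Dict[str, float]) -> List[float]:
    """Scale each single-view feature vector by its normalized weight,
    then concatenate the views in sorted name order (assumed scheme)."""
    total = sum(weights[name] for name in views)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused: List[float] = []
    for name in sorted(views):
        w = weights[name] / total  # normalize so the weights sum to 1
        fused.extend(w * x for x in views[name])
    return fused


# Hypothetical two-view example; real views would be much longer vectors.
example = integrate_views(
    {"pssm_clbp": [1.0, 2.0], "esm": [4.0]},
    {"pssm_clbp": 1.0, "esm": 3.0},
)
```

An optimizer such as differential evolution would repeatedly call a function like this with candidate weights and keep the weights that give the best validation score.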

Language: English

Leveraging protein language models for robust antimicrobial peptide detection DOI
Lichao Zhang, Shuwen Xiong, Lei Xu, et al.

Methods, Journal Year: 2025, Volume and Issue: unknown

Published: March 1, 2025

Language: English

Citations

0

Prediction of lncRNA-miRNA interaction based on sequence and structural information of potential binding site DOI

Dan-Yang Qi, Chengyan Wu, Zhihong Hao, et al.

International Journal of Biological Macromolecules, Journal Year: 2025, Volume and Issue: unknown, P. 142255 - 142255

Published: March 1, 2025

Language: English

Citations

0

NeuroPred-AIMP: Multimodal Deep Learning for Neuropeptide Prediction via Protein Language Modeling and Temporal Convolutional Networks DOI

Jinjin Li, Shuwen Xiong, Hua Shi, et al.

Journal of Chemical Information and Modeling, Journal Year: 2025, Volume and Issue: unknown

Published: April 21, 2025

Neuropeptides are key signaling molecules that regulate fundamental physiological processes ranging from metabolism to cognitive function. However, their accurate identification is a huge challenge due to sequence heterogeneity, obscured functional motifs, and limited experimentally validated data. Accurate identification of neuropeptides is critical for advancing neurological disease therapeutics and peptide-based drug design. Existing neuropeptide identification methods rely on manual features combined with traditional machine learning methods, which find it difficult to capture the deep patterns of sequences. To address these limitations, we propose NeuroPred-AIMP (adaptive integrated multimodal predictor), an interpretable model that synergizes the global semantic representation of a protein language model (ESM) with the multiscale structural features of a temporal convolutional network (TCN). The model introduces an adaptive feature fusion mechanism with residual enhancement to dynamically recalibrate feature contributions and achieve robust integration of evolutionary and local sequence information. The experimental results demonstrated that the proposed model showed excellent comprehensive performance on the independent test set, with an accuracy of 92.3% and an AUROC of 0.974. Simultaneously, the model showed a good balance in its ability to identify positive and negative samples, with a sensitivity of 92.6% and a specificity of 92.1%, a difference of less than 0.5%. This result fully confirms the effectiveness of the multimodal fusion strategy in the task of neuropeptide recognition.
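For readers less familiar with the reported metrics, the following sketch shows how accuracy, sensitivity, and specificity are conventionally derived from confusion-matrix counts. The function name `binary_metrics` and the counts in the example are hypothetical; they are not the paper's data and do not reproduce its figures.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)   # fraction of positives correctly identified
    specificity = tn / (tn + fp)   # fraction of negatives correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        # Gap between the two rates: a small gap indicates balanced behaviour.
        "balance_gap": abs(sensitivity - specificity),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)
```

A small `balance_gap` is the kind of evidence the abstract cites when it describes the model as balanced across positive and negative samples.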

Language: English

Citations

0

scRSSL: Residual semi‐supervised learning with deep generative models to automatically identify cell types DOI Creative Commons

Yanru Gao, Hongyu Duan, Fan‐hao Meng, et al.

IET Systems Biology, Journal Year: 2025, Volume and Issue: unknown

Published: April 22, 2025

Abstract Single‐cell sequencing (scRNA‐seq) allows researchers to study cellular heterogeneity in individual cells. In single‐cell transcriptomics analysis, identifying the cell type of individual cells is a key task. At present, single‐cell datasets often face the challenges of high dimensionality, a large number of samples, high sparsity, and sample imbalance, which strain traditional methods of cell type recognition. The authors propose a deep residual generation model based on semi‐supervised learning (scRSSL) to address these challenges. ScRSSL introduces residual networks into semi‐supervised generative models. The authors take advantage of its semi‐supervised features to solve the problem of sample imbalance. During the training of the model, the authors use a residual neural network to accomplish the inference of cell types so that local features of single‐cell data can be extracted. Because of the semi‐supervised approach, it can automatically and accurately predict cell types in datasets, even with only a small number of labels. Experimentally, the authors’ method has been shown to perform better than other methods.

Language: English

Citations

0

DGCLCMI: a deep graph collaboration learning method to predict circRNA-miRNA interactions DOI Creative Commons
Chao Cao, Mengli Li, Chunyu Wang, et al.

BMC Biology, Journal Year: 2025, Volume and Issue: 23(1)

Published: April 23, 2025

Abstract Background: Numerous studies have shown that circRNA can act as a miRNA sponge, competitively binding to miRNAs, thereby regulating gene expression and disease progression. Due to the high cost and time-consuming nature of traditional wet lab experiments, analyzing circRNA-miRNA associations is often inefficient and labor-intensive. Although some computational models have been developed to identify these associations, they fail to capture the deep collaborative features between circRNA-miRNA interactions and do not guide the training of feature extraction networks based on high-order relationships, leading to poor prediction performance. Results: To address these issues, we propose a novel deep graph collaboration learning method for circRNA-miRNA interaction prediction, called DGCLCMI. First, it uses word2vec to encode sequences into word embeddings. Next, we present a joint model that combines an improved neural filtering method with network optimization. Deep interaction information is embedded as informative features within the sequence representations for prediction. Comprehensive experiments on three well-established datasets across seven metrics demonstrate that our algorithm significantly outperforms previous models, achieving an average AUC of 0.960. In addition, a case study reveals that 18 out of 20 predicted unknown CMI data points are accurate. Conclusions: DGCLCMI improves circRNA-miRNA interaction representation by capturing deep collaborative information, achieving superior performance compared to prior methods. It facilitates the discovery of unknown circRNA-miRNA interactions and sheds light on their roles in physiological processes.
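The abstract states that sequences are encoded with word2vec but does not say how a sequence is split into "words". A common choice, assumed here purely for illustration, is overlapping k-mers. The function name `kmer_tokens` and the default `k=3` are hypothetical and are not confirmed by the source.

```python
from typing import List


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Split a nucleotide sequence into overlapping k-mers (stride 1).

    Returns an empty list when the sequence is shorter than k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.strip().upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Each sequence becomes one "sentence" of k-mer "words"; a corpus of such
# sentences is the kind of input a word2vec trainer expects.
corpus = [kmer_tokens(s) for s in ["AUGGC", "ggcua"]]
```

Under this assumption, every distinct k-mer receives an embedding vector, and a sequence representation can be built from the embeddings of its k-mers.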

Language: English

Citations

0

Exploring species taxonomic kingdom using information entropy and nucleotide compositional features of coding sequences based on machine learning methods DOI
Sebu Aboma Temesgen, Basharat Ahmad, Bakanina Kissanga Grace-Mercure, et al.

Methods, Journal Year: 2025, Volume and Issue: unknown

Published: April 1, 2025

Language: English

Citations

0

MlyPredCSED: based on extreme point deviation compensated clustering combined with cross-scale convolutional neural networks to predict multiple lysine sites in human DOI Creative Commons
Yuhua Zuo, Xingze Fang, Jiankang Chen, et al.

Briefings in Bioinformatics, Journal Year: 2025, Volume and Issue: 26(2)

Published: March 1, 2025

Abstract In post-translational modification, covalent bonds on lysine and attached chemical groups significantly change proteins’ physical properties. They shape protein structures, enhance function and stability, and are vital for physiological processes, affecting health and disease through mechanisms like gene expression, signal transduction, protein degradation, and cell metabolism. Although lysine (K) modification sites are considered among the most common types of post-translational modifications in proteins, research on K-PTMs has largely overlooked the synergistic effects between different modifications and lacked techniques to address the problem of sample imbalance. Based on this, the Extreme Point Deviation Compensated Clustering (EPDCC) Undersampling algorithm was proposed in this study and combined with Cross-Scale Convolutional Neural Networks (CSCNNs) to develop a novel computational tool, MlyPredCSED, for simultaneously predicting multiple lysine modification sites. MlyPredCSED employs Multi-Label Position-Specific Triad Amino Acid Propensity and the physicochemical properties of amino acids to enhance the richness of sequence information. To address the challenge of sample imbalance, the EPDCC Undersampling technique is introduced to adjust the majority class samples. The model’s training and testing phase relies on the CSCNN framework. In cross-validation and testing, MlyPredCSED outperformed existing models, especially in complex categories. This not only provides an efficient method for the identification of modification sites but also demonstrates its value in biological research and drug development. To facilitate use by researchers, we have developed an accessible free web tool: http://www.mlypredcsed.com.

Language: English

Citations

0

Taco-DDI: accurate prediction of drug-drug interaction events using graph transformers and dynamic co-attention matrices DOI
Jianbo Qiao, Xu Guo, Junru Jin, et al.

Neural Networks, Journal Year: 2025, Volume and Issue: unknown, P. 107655 - 107655

Published: May 1, 2025

Language: English

Citations

0

PBertKla: a protein large language model for predicting human lysine lactylation sites DOI Creative Commons
Hongyan Lai, Dan Luo, Mi Yang, et al.

BMC Biology, Journal Year: 2025, Volume and Issue: 23(1)

Published: April 6, 2025

Lactylation is a newly discovered type of post-translational modification, primarily occurring on lysine (K) residues of both histones and non-histones to exert diverse effects on target proteins. Research has shown that lysine lactylation (Kla) modification is ubiquitous in different cells and participates in the determination of cell function and fate, as well as the initiation and progression of various diseases. Precise identification of Kla sites is fundamental for elucidating their biological functions and uncovering their application potential. Here, we proposed a novel human Kla site predictor (named PBertKla) by curating a reliable benchmark dataset with proper sample length and sequence identity threshold to train a protein large language model with optimal hyperparameters. Extensive experimental results consistently demonstrated that our model possessed robust prediction ability, achieving an AUC (area under the receiver operating characteristic curve) value of over 0.880 on independent validation data. Feature visualization analysis further validated the effectiveness of feature learning and representation from sequences. Moreover, we benchmarked PBertKla against other cutting-edge models on testing data from different sources, highlighting its superiority and transferability. All results indicated that PBertKla excelled in the automatic prediction of human Kla sites, and it would advance the investigation of lysine modifications and their significance in health and disease.
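The AUC quoted above has a simple interpretation: it is the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition. The function name `roc_auc` and the example labels and scores are hypothetical and unrelated to the PBertKla data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores (equivalent to the Mann-Whitney U statistic)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical scores: three of the four positive/negative pairs are ordered
# correctly, so the AUC is 3/4.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations; rank-based implementations are preferable for large datasets.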

Language: English

Citations

0

Improving protein-protein interaction modulator predictions via knowledge-fused language models DOI
Zitong Zhang, Quan Zou, Chunyu Wang, et al.

Information Fusion, Journal Year: 2025, Volume and Issue: unknown, P. 103227 - 103227

Published: April 1, 2025

Language: English

Citations

0