GICL: A Cross-Modal Drug Property Prediction Framework Based on Knowledge Enhancement of Large Language Models
Na Li, Jianbo Qiao, Fei Gao

et al.

Journal of Chemical Information and Modeling, Journal Year: 2025, Volume and Issue: unknown

Published: May 27, 2025

Deep learning models have demonstrated their potential in learning effective molecular representations critical for drug property prediction and discovery. Despite significant advancements in leveraging multimodal molecule semantics, existing approaches often struggle with challenges such as low-quality data and structural complexity. Large language models (LLMs) excel in generating high-quality representations due to their robust characterization capabilities. In this work, we introduce GICL, a cross-modal contrastive learning framework that integrates LLM-derived embeddings with image representations. Specifically, LLMs extract features from the SMILES strings of molecules, which are then contrasted with graphical images of the molecules to achieve a holistic understanding of their features. Experimental results demonstrate that GICL achieves state-of-the-art performance on the ADMET task while offering interpretable insights into drug properties, thereby facilitating more efficient drug design.

Language: English

Citations

0

DrugDAGT: a dual-attention graph transformer with contrastive learning improves drug-drug interaction prediction
Yaojia Chen, Jiacheng Wang, Quan Zou

et al.

BMC Biology, Journal Year: 2024, Volume and Issue: 22(1)

Published: Oct. 14, 2024

Drug-drug interactions (DDIs) can result in unexpected pharmacological outcomes, including adverse drug events, which are crucial for drug discovery. Graph neural networks have substantially advanced our ability to model molecular representations; however, the precise identification of key local structures and the capture of long-distance structural correlations for better DDI prediction and interpretation remain significant challenges. Here, we present DrugDAGT, a dual-attention graph transformer framework with contrastive learning for predicting multiple DDI types. The framework incorporates attention mechanisms at both the bond and atomic levels, thereby enabling the integration of short- and long-range dependencies within molecules to pinpoint essential local structures. Moreover, DrugDAGT further implements contrastive learning to maximize the similarity of representations across different views for better discrimination of molecular structures. Experiments in both warm-start and cold-start scenarios demonstrate that DrugDAGT outperforms state-of-the-art baseline models, achieving superior overall performance. Furthermore, visualization of the learned representations of drug pairs and of the attention map provides interpretable insights instead of black-box results. DrugDAGT is an effective tool for accurately predicting multiple DDI types by identifying key chemical structures, offering valuable insights for prescribing medications and guiding drug development. All data and code can be found at https://github.com/codejiajia/DrugDAGT.

Language: English

Citations

5

NeuroPred-AIMP: Multimodal Deep Learning for Neuropeptide Prediction via Protein Language Modeling and Temporal Convolutional Networks
Jinjin Li, Shuwen Xiong, Hua Shi

et al.

Journal of Chemical Information and Modeling, Journal Year: 2025, Volume and Issue: unknown

Published: April 21, 2025

Neuropeptides are key signaling molecules that regulate fundamental physiological processes ranging from metabolism to cognitive function. However, their accurate identification is a major challenge due to sequence heterogeneity, obscured functional motifs, and limited experimentally validated data. Accurate identification of neuropeptides is critical for advancing neurological disease therapeutics and peptide-based drug design. Existing neuropeptide identification methods rely on manual features combined with traditional machine learning methods, which struggle to capture the deep patterns of sequences. To address these limitations, we propose NeuroPred-AIMP (adaptive integrated multimodal predictor), an interpretable model that synergizes the global semantic representation of a protein language model (ESM) with the multiscale structural features of a temporal convolutional network (TCN). The model introduces an adaptive fusion mechanism with residual enhancement to dynamically recalibrate feature contributions and achieve robust integration of evolutionary and local information. The experimental results demonstrated that the proposed model showed excellent comprehensive performance on the independent test set, with an accuracy of 92.3% and an AUROC of 0.974. It also showed a good balance in its ability to identify positive and negative samples, with a sensitivity of 92.6% and a specificity of 92.1%, a difference of less than 0.5%. This result fully confirms the effectiveness of the fusion strategy in the task of neuropeptide recognition.

Language: English

Citations

0

MlyPredCSED: based on extreme point deviation compensated clustering combined with cross-scale convolutional neural networks to predict multiple lysine sites in human
Yuhua Zuo, Xingze Fang, Jiankang Chen

et al.

Briefings in Bioinformatics, Journal Year: 2025, Volume and Issue: 26(2)

Published: March 1, 2025

In post-translational modification, covalent bonds on lysine and the attached chemical groups significantly change proteins' physical properties. They shape protein structures, enhance function and stability, and are vital for physiological processes, affecting health and disease through mechanisms like gene expression, signal transduction, protein degradation, and cell metabolism. Although lysine (K) modification sites are considered among the most common types of post-translational modifications in proteins, research on K-PTMs has largely overlooked the synergistic effects between different modifications and lacked techniques to address the problem of sample imbalance. Based on this, the Extreme Point Deviation Compensated Clustering (EPDCC) Undersampling algorithm was proposed in this study and combined with Cross-Scale Convolutional Neural Networks (CSCNNs) to develop a novel computational tool, MlyPredCSED, for simultaneously predicting multiple lysine modification sites. MlyPredCSED employs Multi-Label Position-Specific Triad Amino Acid Propensity and the physicochemical properties of amino acids to enhance the richness of sequence information. To address the challenge of sample imbalance, the innovative EPDCC Undersampling technique was introduced to adjust the majority class samples. The model's training and testing phase relies on the advanced CSCNN framework. In cross-validation and testing, MlyPredCSED outperformed existing models, especially in complex categories. This study not only provides an efficient method for the identification of lysine modification sites but also demonstrates its value in biological research and drug development. To facilitate its use by researchers, we have specifically developed an accessible free web tool: http://www.mlypredcsed.com.

Language: English

Citations

0

Improving protein-protein interaction modulator predictions via knowledge-fused language models
Zitong Zhang, Quan Zou, Chunyu Wang

et al.

Information Fusion, Journal Year: 2025, Volume and Issue: unknown, P. 103227 - 103227

Published: April 1, 2025

Language: English

Citations

0

Anticancer drug synergy prediction based on CatBoost
Changheng Li, Na‐Na Guan, Hongyi Zhang

et al.

PeerJ Computer Science, Journal Year: 2025, Volume and Issue: 11, P. e2829 - e2829

Published: May 19, 2025

Background: Research on cancer treatments has always been a hot topic in the medical field. Multi-targeted combination drugs have been considered an ideal option for cancer treatment. Since it is not feasible to use clinical experience or high-throughput screening to identify the complete combinatorial space, methods such as machine learning models offer the possibility to explore the combinatorial space effectively. Methods: In this work, we proposed a method based on CatBoost to predict the synergy scores of anticancer drug combinations on cancer cell lines, which utilized oblivious trees and the ordered boosting technique to avoid overfitting and bias. The model was trained and tested using data screened from the NCI-ALMANAC dataset. The drugs were characterized with Morgan fingerprints, drug target information, and monotherapy information, and the cell lines were described with gene expression profiles. Results: In stratified 5-fold cross-validation, our method obtained excellent results: the receiver operating characteristic area under the curve (ROC AUC) was 0.9217, the precision-recall area under the curve (PR AUC) was 0.4651, the mean squared error (MSE) was 0.1365, and the Pearson correlation coefficient was 0.5335. The performance was significantly better than that of three other advanced models. Additionally, when using SHapley Additive exPlanations (SHAP) to interpret the biological significance of the prediction results, we found that drug features played more prominent roles than cell line features, and that genes associated with cancer development, such as PTK2, CCND1, and GNA11, played an important part in the prediction. Combining the experimental results, the model in this study has a good prediction effect and can be used as an alternative method for predicting anticancer drug combinations.

Language: English

Citations

0

Taco-DDI: accurate prediction of drug-drug interaction events using graph transformers and dynamic co-attention matrices
Jianbo Qiao, Xu Guo, Junru Jin

et al.

Neural Networks, Journal Year: 2025, Volume and Issue: 189, P. 107655 - 107655

Published: May 20, 2025

Language: English

Citations

0
