Interpretable Multimodal Deep Ensemble Framework Dissecting Blood-Brain Barrier Permeability with Molecular Features

Dushuo Feng, Lulu Guan, Yunxiang Sun, et al.

The Journal of Physical Chemistry Letters, Year: 2025, Issue: unknown, pp. 5806-5819

Published: June 3, 2025

Blood-brain barrier permeability (BBBP) prediction plays a critical role in the drug discovery process, particularly for compounds targeting the central nervous system. While machine learning (ML) has significantly advanced the prediction of BBBP, there remains an urgent need for interpretable ML models that can reveal the physicochemical principles governing BBB permeability. In this study, we propose a multimodal framework that integrates molecular fingerprints (Morgan, MACCS, RDK) and image features to improve BBBP prediction. The classification task (BBB-permeable vs nonpermeable) is addressed with a stacking ensemble model combining multiple base classifiers. The proposed framework demonstrates competitive predictive stability, generalization ability, and feature interpretability compared with recent approaches, under comparable evaluation settings. Beyond performance, our framework incorporates Principal Component Analysis (PCA) and Shapley Additive Explanations (SHAP) analysis to highlight key fingerprint features contributing to predictions. The regression task (logBB value prediction) is tackled by a multi-input deep learning framework, incorporating a Transformer encoder for fingerprint processing, a convolutional neural network (CNN) for image feature extraction, and a Multi-Head Attention fusion mechanism to enhance feature interactions. Attention maps are derived from token-level relationships within the molecular representations. This work provides an interpretable modeling framework with enhanced transparency and mechanistic insight, and lays the foundation for future studies incorporating transparent descriptors and physics-informed features.

Language: English

AI/ML methodologies and the future-will they be successful in designing the next generation of new chemical entities?
Rachelle J. Bienstock

Journal of Cheminformatics, Year: 2025, Issue: 17(1)

Published: April 6, 2025

Cheminformatics and chemical databases are essential to drug discovery. However, machine learning (ML) and artificial intelligence (AI) methodologies are changing the way in which chemical data is used. How will the use of chemical data change drug discovery moving forward? How do new ML methods for molecular property prediction, hit and lead identification, target identification, and structure prediction differ from and compare with previous computational methods? Will ML and AI improve diversity in ligand design and offer computational enhancements? There are still many advantages to physics-based methods, and they offer something lacking in ML/AI methods. Additionally, ML training methods often give the best results when experimental assay data and measurements are fed back into the model. Often modeling and experimental methods are not diametrically opposed but offer the greatest advantage when used in a complementary way.

Language: English

Cited by: 0

Evaluating Molecular Similarity Measures: Do Similarity Measures Reflect Electronic Structure Properties?
Rebekah Duke, Chih-Hsuan Yang, Baskar Ganapathysubramanian, et al.

Journal of Chemical Information and Modeling, Year: 2025, Issue: unknown

Published: April 29, 2025

The rapid adoption of big data, machine learning (ML), and generative artificial intelligence (AI) in chemical discovery has heightened the importance of quantifying molecular similarity. Molecular similarity, commonly assessed as the distance between molecular fingerprints, is integral to applications such as database curation, diversity analysis, and property prediction. AI tools frequently rely on these similarity measures to cluster molecules under the assumption that structurally similar molecules exhibit similar properties. However, this assumption is not universally valid, particularly for continuous properties like electronic structure properties. Despite the prevalence of fingerprint-based similarity measures, their evaluation has largely depended on biological activity data sets and qualitative metrics, limiting their relevance for nonbiological domains. To address this gap, we propose a framework to evaluate the correlation between molecular similarity measures and molecular properties. Our approach builds on the concept of neighborhood behavior and incorporates kernel density estimation (KDE) analysis to quantify how well similarity measures capture property relationships. Using a data set of over 350 million molecule pairs with electronic structure, redox, and optical properties, we systematically evaluate several molecular fingerprint generators, distance functions, and their correlation with these properties. Both the curated data set and the evaluation framework are publicly available.

Language: English

Cited by: 0
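
The Duke et al. abstract above describes molecular similarity as a distance between fingerprints and asks whether similar molecules have similar properties. A minimal sketch of that idea follows, assuming fingerprints are represented as sets of "on" bit indices and using the Tanimoto coefficient, a common choice that the abstract itself does not name. The molecule names, bit sets, and property values are hypothetical toy data, not taken from the paper.

```python
from itertools import combinations


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical toy data: name -> (fingerprint bits, property value).
molecules = {
    "mol_a": ({1, 4, 7, 9}, 2.10),
    "mol_b": ({1, 4, 8}, 2.30),
    "mol_c": ({2, 3, 5}, 4.00),
}

# For every pair, record (fingerprint distance, absolute property difference).
# Neighborhood-behavior style analyses look at how these two quantities relate.
pairs = []
for (name_1, (fp_1, prop_1)), (name_2, (fp_2, prop_2)) in combinations(
    molecules.items(), 2
):
    distance = 1.0 - tanimoto(fp_1, fp_2)
    pairs.append((name_1, name_2, distance, abs(prop_1 - prop_2)))

for name_1, name_2, distance, prop_diff in pairs:
    print(f"{name_1}-{name_2}: distance={distance:.2f}, property diff={prop_diff:.2f}")
```

In a real evaluation the pair list would be far larger, and the joint distribution of distance and property difference would be summarized statistically (for example with KDE, as the abstract mentions) rather than printed.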

Machine Learning Pipeline for Molecular Property Prediction Using ChemXploreML
Aravindh N. Marimuthu, Brett A. McGuire

Journal of Chemical Information and Modeling, Year: 2025, Issue: unknown

Published: May 20, 2025

We present ChemXploreML, a modular desktop application designed for machine learning-based molecular property prediction. The framework's flexible architecture allows integration of any molecular embedding technique with modern machine learning algorithms, enabling researchers to customize their prediction pipelines without extensive programming expertise. To demonstrate the framework's capabilities, we implement and evaluate two molecular embedding approaches, Mol2Vec and VICGAE (Variance-Invariance-Covariance regularized GRU Auto-Encoder), combined with state-of-the-art tree-based ensemble methods (Gradient Boosting Regression, XGBoost, CatBoost, and LightGBM). Using five fundamental molecular properties as test cases (melting point, boiling point, vapor pressure, critical temperature (CT), and critical pressure), we validate our framework on a data set from the CRC Handbook of Chemistry and Physics. The models achieve excellent performance for well-distributed properties, with R² values up to 0.93 for CT predictions. Notably, while Mol2Vec embeddings (300 dimensions) delivered slightly higher accuracy, VICGAE embeddings (32 dimensions) exhibited comparable performance yet offered significantly improved computational efficiency. ChemXploreML's modular design facilitates easy integration of new embedding techniques and machine learning algorithms, providing a flexible platform for customized property prediction tasks. The application automates chemical data preprocessing (including UMAP-based exploration of molecular space), model optimization, and performance analysis through an intuitive interface, making sophisticated machine learning techniques accessible while maintaining extensibility for advanced cheminformatics users.

Language: English

Cited by: 0
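
The ChemXploreML abstract above reports model quality as R² (for example, up to 0.93 for critical temperature). As a reminder of what that number measures, here is a minimal sketch of the standard coefficient of determination, R² = 1 - SS_res / SS_tot. The function name and the sample values are hypothetical, and the abstract does not state exactly how the authors computed their scores.

```python
def r_squared(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs predicted property values (arbitrary units).
measured = [500.0, 550.0, 600.0, 650.0]
predicted = [510.0, 540.0, 605.0, 640.0]
print(f"R^2 = {r_squared(measured, predicted):.3f}")
```

A value of 1.0 means the predictions match the measurements exactly, and 0.0 means the model does no better than always predicting the mean of the measured values.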
