Skew-probabilistic neural networks for learning from imbalanced data
Shraddha M. Naik, Tanujit Chakraborty, Madhurima Panja

et al.

Pattern Recognition, Journal Year: 2025, Volume and Issue: unknown, P. 111677 - 111677

Published: April 1, 2025

Language: English

The receiver operating characteristic curve accurately assesses imbalanced datasets (Creative Commons)
Eve Richardson, Raphael Trevizani, Jason Greenbaum

et al.

Patterns, Journal Year: 2024, Volume and Issue: 5(6), P. 100994 - 100994

Published: May 31, 2024

Many problems in biology require looking for a "needle in a haystack," corresponding to a binary classification where there are a few positives within a much larger set of negatives, which is referred to as class imbalance. The receiver operating characteristic (ROC) curve and the associated area under the curve (AUC) have been reported as ill-suited to evaluate prediction performance on imbalanced problems where there is more interest in performance on the positive minority class, while the precision-recall (PR) curve is preferable. We show via simulation and a real case study that this is a misinterpretation of the difference between the ROC and PR spaces, showing that the ROC curve is robust to class imbalance, while the PR curve is highly sensitive to class imbalance. Furthermore, we show that class imbalance cannot be easily disentangled from classifier performance measured via PR-AUC.
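The contrast described in this abstract can be illustrated with a small, self-contained sketch. It is not the authors' simulation: the scores below are made-up values, and `roc_auc` and `average_precision` are hypothetical helper names written for this example. Repeating the negative scores 20 times leaves the score distributions unchanged while making the positive class rarer.

```python
def roc_auc(pos, neg):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = sum(p > n for p in pos for n in neg)
    ties = sum(p == n for p in pos for n in neg)
    return (2 * wins + ties) / (2 * len(pos) * len(neg))


def average_precision(pos, neg):
    """Mean of the precision measured at the rank of each positive (a step-wise PR-AUC)."""
    ranked = sorted([(s, 1) for s in pos] + [(s, 0) for s in neg], key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / len(pos)


# Made-up classifier scores (assumption for illustration only).
pos = [0.9, 0.8, 0.6, 0.4]
neg = [0.7, 0.5, 0.3, 0.2, 0.1]

# Same score distributions, but 20x more negatives in the second case.
balanced = (roc_auc(pos, neg), average_precision(pos, neg))
imbalanced = (roc_auc(pos, neg * 20), average_precision(pos, neg * 20))

print("ROC-AUC:", balanced[0], "->", imbalanced[0])
print("Average precision:", round(balanced[1], 3), "->", round(imbalanced[1], 3))
```

In this toy case the ROC-AUC stays at 0.85 in both settings, while the average precision falls from about 0.85 to about 0.56 even though the classifier's ability to rank positives above negatives has not changed.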

Language: English

Citations

21

Phase prediction and experimental realisation of a new high entropy alloy using machine learning (Creative Commons)
Swati Singh, Nirmal Kumar Katiyar, Saurav Goel

et al.

Scientific Reports, Journal Year: 2023, Volume and Issue: 13(1)

Published: March 23, 2023

Nearly ~10^8 types of high entropy alloys (HEAs) can be developed from about 64 elements in the periodic table. A major challenge for materials scientists and metallurgists at this stage is to predict their crystal structure and, therefore, their mechanical properties to reduce experimental efforts, which are energy and time intensive. Through this paper, we show that it is possible to use machine learning (ML) in this arena for phase prediction to develop novel HEAs. We tested five robust algorithms, namely K-nearest neighbours (KNN), support vector machine (SVM), decision tree classifier (DTC), random forest classifier (RFC) and XGBoost (XGB), in their vanilla form (base models) on a large dataset screened specifically from experimental data concerning HEA fabrication using melting and casting manufacturing methods. This was necessary to avoid the discrepancy inherent with comparing HEAs obtained from different synthesis routes, as it causes spurious effects while treating imbalanced data, an erroneous practice we observed in the reported literature. We found that (i) RFC model predictions were more reliable in contrast to other models and (ii) synthetic data augmentation is not a neat practice in materials science, especially for developing HEAs, where it cannot assure phase information reliably. To substantiate our claim, we compared the vanilla RFC (V-RFC) model for the original data (1200 datasets) with the SMOTE-Tomek links augmented RFC (ST-RFC) model for the new datasets (1200 original + 192 generated = 1392 datasets). We found that although the ST-RFC model showed a higher average test accuracy of 92%, no significant breakthroughs were observed when testing the number of correct and incorrect predictions using the confusion matrix and ROC-AUC scores for individual phases. Based on our RFC model, we report the development of a new HEA (Ni25Cu18.75Fe25Co25Al6.25) exhibiting an FCC phase, proving the robustness of our predictions.

Language: English

Citations

39

Quality over quantity: powering neuroimaging samples in psychiatry
Carolina Makowski, Thomas E. Nichols, Anders M. Dale

et al.

Neuropsychopharmacology, Journal Year: 2024, Volume and Issue: 50(1), P. 58 - 66

Published: June 20, 2024

Language: English

Citations

9

An explainable predictive machine learning model of gangrenous cholecystitis based on clinical data: a retrospective single center study (Creative Commons)
Ying Ma, Man Luo, Guoxin Guan

et al.

World Journal of Emergency Surgery, Journal Year: 2025, Volume and Issue: 20(1)

Published: Jan. 6, 2025

Gangrenous cholecystitis (GC) is a serious clinical condition associated with high morbidity and mortality rates. Machine learning (ML) has significant potential in addressing the diverse characteristics of real data. We aim to develop an explainable and cost-effective predictive model for GC utilizing ML and the Shapley Additive explanation (SHAP) algorithm. This study included a total of 1006 patients with 26 clinical features. Through 5-fold cross-validation (CV), the best performing integrated learning model, XGBoost, was identified. The model was interpreted using SHAP to derive the feature subsets WBC, NLR, D-dimer, gallbladder width, fibrinogen, gallbladder wallness, and hypokalemia or hyponatremia, and these subsets comprised the final diagnostic prediction model. The study developed an explainable predictive tool for GC at an early stage. This could assist doctors in making quick surgical intervention decisions and performing surgery on patients with GC as soon as possible.

Language: English

Citations

1

No effect of apolipoprotein E polymorphism on MRI brain activity during movie watching (Creative Commons)
Petar Raykov, Jessica Daly, Simon E. Fisher

et al.

Brain and Neuroscience Advances, Journal Year: 2025, Volume and Issue: 9

Published: Jan. 1, 2025

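Apolipoprotein E ε4 is a major genetic risk factor for Alzheimer's disease, and some apolipoprotein E ε4 carriers show Alzheimer's disease-related neuropathology many years before cognitive changes are apparent. Therefore, studying healthy, genotyped individuals offers an opportunity to investigate the earliest changes in brain measures that may signal the presence of disease-related processes. For example, subtle changes in functional magnetic resonance imaging functional connectivity, particularly within the default mode network, have been described when comparing healthy ε4 carriers to ε3 carriers. Similarly, very mild impairments of episodic memory have also been documented in healthy apolipoprotein E ε4 carriers. Here, we use a naturalistic activity (movie watching), and a marker of episodic memory encoding (transient changes in functional magnetic resonance imaging activity and functional connectivity around so-called 'event boundaries'), to investigate potential phenotype differences associated with the apolipoprotein E ε4 genotype in a large sample of healthy adults. Using Bayes factor analyses, we found strong evidence against the existence of differences associated with apolipoprotein E allelic status. Similarly, we did not find apolipoprotein E-associated differences when we ran exploratory analyses examining system segregation across the whole brain and connectivity within the default mode network. We conclude that apolipoprotein E genotype has little or no effect on how ongoing experiences are processed in healthy adults. The mild phenotype differences observed in some studies may reflect early effects of Alzheimer's disease-related pathology in apolipoprotein E ε4 carriers.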

Language: English

Citations

1

Imbalanced rock burst assessment using variational autoencoder-enhanced gradient boosting algorithms and explainability (Creative Commons)
Shan Lin, Zenglong Liang, Miao Dong

et al.

Underground Space, Journal Year: 2024, Volume and Issue: 17, P. 226 - 245

Published: Jan. 21, 2024

We conducted a study to evaluate the potential and robustness of gradient boosting algorithms in rock burst assessment, established a variational autoencoder (VAE) to address the imbalance of the rock burst dataset, and proposed a multilevel explainable artificial intelligence (XAI) tailored for tree-based ensemble learning. We collected 537 data from real-world rock burst records and selected four critical features contributing to rock burst occurrences. Initially, we employed data visualization to gain insight into the data's structure and performed correlation analysis to explore the data distribution and feature relationships. Then, we set up a VAE model to generate samples for the minority class due to the imbalanced class distribution. In conjunction with the VAE, we compared and evaluated six state-of-the-art ensemble models, including gradient boosting algorithms and the classical logistic regression model, for rock burst prediction. The results indicated that gradient boosting algorithms outperformed the classical single model and that the VAE-classifier outperformed the original classifier, with the VAE-NGBoost model yielding the most favorable results. Compared to other resampling methods combined with NGBoost for imbalanced datasets, such as the synthetic minority oversampling technique (SMOTE), SMOTE-edited nearest neighbours (SMOTE-ENN), and SMOTE-Tomek links (SMOTE-Tomek), the VAE-NGBoost model yielded the best performance. Finally, we developed a multilevel XAI model using feature sensitivity analysis, Tree Shapley Additive exPlanations (Tree SHAP), and Anchor to provide an in-depth exploration of the decision-making mechanics of VAE-NGBoost, further enhancing the accountability of tree-based ensemble models in predicting rock burst occurrences.

Language: English

Citations

8

Harnessing machine learning to predict cytochrome P450 inhibition through molecular properties
Hamza Zahid, Hilal Tayara, Kil To Chong

et al.

Archives of Toxicology, Journal Year: 2024, Volume and Issue: 98(8), P. 2647 - 2658

Published: April 15, 2024

Language: English

Citations

8

Extracting interpretable signatures of whole-brain dynamics through systematic comparison (Creative Commons)
Annie G. Bryant, Kevin Aquino, Linden Parkes

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Jan. 12, 2024

The brain's complex distributed dynamics are typically quantified using a limited set of manually selected statistical properties, leaving the possibility that alternative dynamical properties may outperform those reported for a given application. Here, we address this limitation by systematically comparing diverse, interpretable features of both intra-regional activity and inter-regional functional coupling from resting-state functional magnetic resonance imaging (rs-fMRI) data, demonstrating our method using case-control comparisons of four neuropsychiatric disorders. Our findings generally support the use of linear time-series analysis techniques for rs-fMRI case-control analyses, while also identifying new ways to quantify informative dynamical fMRI structures. While simple statistical representations of fMRI dynamics performed surprisingly well (e.g., properties within a single brain region), combining intra-regional properties with inter-regional coupling generally improved performance, underscoring the distributed, multifaceted changes to fMRI dynamics in neuropsychiatric disorders. The comprehensive, data-driven method introduced here enables systematic identification and interpretation of quantitative dynamical signatures of multivariate time-series data, with applicability beyond neuroimaging to diverse scientific problems involving complex time-varying systems.

Language: English

Citations

5

Predicting treatment outcome based on resting-state functional connectivity in internalizing mental disorders: A systematic review and meta-analysis (Creative Commons)
Charlotte Meinke, Ulrike Lueken, Henrik Walter

et al.

Neuroscience & Biobehavioral Reviews, Journal Year: 2024, Volume and Issue: 160, P. 105640 - 105640

Published: March 26, 2024

Predicting treatment outcome in internalizing mental disorders prior to treatment initiation is pivotal for precision mental healthcare. In this regard, resting-state functional connectivity (rs-FC) and machine learning have often shown promising prediction accuracies. This systematic review and meta-analysis evaluates these studies, considering their risk of bias through the Prediction Model Study Risk of Bias Assessment Tool (PROBAST). We examined the predictive performance of features derived from rs-FC, identified the features with the highest predictive value, and assessed the employed machine learning pipelines. We searched the electronic databases Scopus, PubMed and PsycINFO on 12 December 2022, which resulted in 13 included studies. The mean balanced accuracy for predicting treatment outcome was 77% (95% CI: [72%-83%]). rs-FC of the dorsolateral prefrontal cortex had high predictive value in most studies. However, a high risk of bias was identified in all studies, compromising interpretability. Methodological recommendations are provided based on a comprehensive exploration of the studies' machine learning pipelines, and potential fruitful developments are discussed.
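Since the headline figure in this abstract is a balanced accuracy, a short sketch of how that metric differs from plain accuracy may help. The counts below are hypothetical and do not come from the review; `balanced_accuracy` and `accuracy` are illustrative helper names.

```python
def accuracy(tp, fn, tn, fp):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical cohort: 30 responders, 10 non-responders, and a degenerate
# classifier that labels every patient a responder.
always_responder = dict(tp=30, fn=0, tn=0, fp=10)

plain = accuracy(**always_responder)
bal = balanced_accuracy(**always_responder)
print(plain, bal)
```

Under these assumed counts the degenerate classifier reaches 75% plain accuracy but only 50% balanced accuracy, which is why balanced accuracy is the more informative summary when outcome classes are unequal in size.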

Language: English

Citations

4

DeepRA: A novel deep learning-read-across framework and its application in non-sugar sweeteners mutagenicity prediction
Tarapong Srisongkram

Computers in Biology and Medicine, Journal Year: 2024, Volume and Issue: 178, P. 108731 - 108731

Published: June 12, 2024

Language: English

Citations

4