Methodological and reporting quality of machine learning studies on cancer diagnosis, treatment, and prognosis DOI Creative Commons
Aref Smiley, David Villarreal‐Zegarra, C. Mahony Reátegui-Rivera

et al.

Frontiers in Oncology, Year: 2025, Volume: 15

Published: April 14, 2025

This study aimed to evaluate the quality and transparency of reporting in studies using machine learning (ML) in oncology, focusing on adherence to the Consolidated Reporting Guidelines for Prognostic and Diagnostic Machine Learning Models (CREMLS), TRIPOD-AI (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis), and PROBAST (Prediction Model Risk of Bias Assessment Tool). The literature search included primary studies published between February 1, 2024, and January 31, 2025, that developed or tested ML models for cancer diagnosis, treatment, or prognosis. To reflect the current state of the rapidly evolving landscape of ML applications in oncology, the fifteen most recent articles in each category were selected for evaluation. Two independent reviewers screened the studies and extracted data on study characteristics, reporting quality (CREMLS and TRIPOD+AI), risk of bias (PROBAST), and ML performance metrics. The most frequently studied cancer types were breast cancer (n=7/45; 15.6%), followed by lung cancer and liver cancer (n=5/45; 11.1%). The findings indicate several deficiencies in reporting quality, as assessed by CREMLS and TRIPOD+AI. These deficiencies primarily relate to sample size calculation, strategies for handling outliers, documentation of ML model predictors, access to training and validation data, and heterogeneity. The methodological quality assessment revealed that 89% of the studies exhibited a low overall risk of bias, and all studies showed a low risk of bias in terms of applicability. Regarding the specific AI models identified as best-performing, Random Forest (RF) and XGBoost were the most frequently reported, each used in 17.8% of the studies (n = 8). Additionally, our study outlines the areas where reporting is deficient, providing researchers with guidance to improve these sections and, consequently, reduce the risk of bias in their studies.

Language: English

Cited by

0

Exploring Artificial Intelligence Biases in Predictive Models for Cancer Diagnosis DOI Open Access
Aref Smiley, C. Mahony Reátegui-Rivera, David Villarreal‐Zegarra

et al.

Cancers, Year: 2025, Volume: 17(3), p. 407

Published: Jan. 26, 2025

The American Society of Clinical Oncology (ASCO) has released the principles for the responsible use of artificial intelligence (AI) in oncology, emphasizing fairness, accountability, oversight, equity, and transparency. However, the extent to which these principles are followed is unknown. The goal of this study was to assess the presence of biases and the quality of studies on AI models according to the ASCO principles, and to examine their potential impact through citation analysis and subsequent research applications. A review of original research articles centered on the evaluation of AI predictive models for cancer diagnosis, published in a journal dedicated to informatics and data science in clinical oncology, was conducted. Seventeen bias criteria were used to evaluate the sources of bias in the studies, aligned with ASCO's principles for responsible AI use in oncology. The CREMLS checklist was applied to assess study quality, focusing on reporting standards, and the performance metrics along with the citation counts of the included studies were analyzed. Nine studies were included. The most common biases were environmental and life-course bias, contextual bias, provider expertise bias, and implicit bias. Among the ASCO principles, the least adhered to were transparency, oversight and privacy, and human-centered AI application. Only 22% of the studies provided access to their data. The CREMLS checklist revealed deficiencies in methodology and reporting. Most studies reported performance metrics within moderate to high ranges. Additionally, two studies were replicated in subsequent research. In conclusion, most studies exhibited various types of bias, reporting deficiencies, and failure to adhere to the principles for responsible AI use in oncology, limiting their applicability and reproducibility. Greater transparency, data accessibility, and compliance with international guidelines are recommended to improve the reliability of AI-based predictive models.

Language: English

Cited by

4

Development and multi-center cross-setting validation of an explainable prediction model for sarcopenic obesity: a machine learning approach based on readily available clinical features DOI Creative Commons

Rongna Lian, Huiyu Tang, Zecong Chen

et al.

Aging Clinical and Experimental Research, Year: 2025, Volume: 37(1)

Published: March 1, 2025

Abstract. Objectives: Sarcopenic obesity (SO), characterized by the coexistence of obesity and sarcopenia, is an increasingly prevalent condition in aging populations, associated with numerous adverse health outcomes. We aimed to identify and validate an explainable prediction model of SO using easily available clinical characteristics. Setting and participants: A preliminary cohort of 1,431 participants from three community regions in Ziyang city, China, was used for model development and internal validation. For external validation, we utilized data from 832 residents of multi-center nursing homes. Measurements: The diagnosis of SO was based on the European Society for Clinical Nutrition and Metabolism (ESPEN) and European Association for the Study of Obesity (EASO) criteria. Five machine learning models (support vector machine, logistic regression, random forest, light gradient boosting machine, and extreme gradient boosting) were used to predict SO. The performance of these models was assessed by the area under the receiver operating characteristic curve (AUC). The SHapley Additive exPlanations (SHAP) approach was used for model interpretation. Results: After feature reduction, an 8-feature model demonstrated good predictive ability. Among the five models tested, the support vector machine (SVM) model performed best in both the internal (AUC = 0.862) and external (AUC = 0.785) validation sets. The eight key predictors identified were BMI, gender, neck circumference, waist circumference, thigh circumference, time to full tandem standing, time for five-times sit-to-stand, and age. SHAP analysis revealed BMI and gender as the most influential predictors. To facilitate the utilization of the SVM model in the clinical setting, we developed a web application ( https://svcpredictapp.streamlit.app/ ). Conclusions: We developed and validated an explainable model for predicting SO in aging populations. This model offers a novel, accessible, and interpretable approach with the potential to enhance early detection and intervention strategies. Further studies are warranted to validate our model in diverse populations and to evaluate its impact on patient outcomes when integrated into comprehensive geriatric assessments.

Language: English

Cited by

0
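
The abstract above compares models by the area under the receiver operating characteristic curve (AUC) on internal and external validation sets. As a minimal sketch of what that metric measures, the function below computes AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as half. The function name `auc_from_scores` and the toy labels and scores are illustrative assumptions, not taken from the study.

```python
def auc_from_scores(labels, scores):
    """Compute ROC AUC by pairwise comparison (Mann-Whitney formulation).

    labels: iterable of 0/1 outcomes (1 = condition present).
    scores: iterable of model scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two negatives, two positives.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_from_scores(toy_labels, toy_scores))
```

The pairwise form is quadratic in sample size, which is fine for illustration; a rank-based implementation would be the usual choice for larger cohorts.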

Limitations of Binary Classification for Long-Horizon Diagnosis Prediction and Advantages of a Discrete-Time Time-to-Event Approach: Empirical Analysis DOI Creative Commons
De Rong Loh, Elliot D. Hill, Nan Liu

et al.

JMIR AI, Year: 2025, Volume: 4, p. e62985

Published: March 27, 2025

Abstract. Background: A major challenge in using electronic health records (EHR) is the inconsistency of patient follow-up, resulting in right-censored outcomes. This becomes particularly problematic in long-horizon event predictions, such as autism and attention-deficit/hyperactivity disorder (ADHD) diagnoses, where a significant number of patients are lost to follow-up before the outcome can be observed. Consequently, fully supervised methods such as binary classification (BC), which are trained to predict observed diagnoses, are substantially affected by the probability of sufficient follow-up, leading to biased results. Objective: This empirical analysis aims to characterize BC's inherent limitations for long-horizon diagnosis prediction from EHR, and to quantify the benefits of a specific time-to-event (TTE) approach, the discrete-time neural network (DTNN). Methods: Records within the Duke University Health System EHR were analyzed, extracting features such as ICD-10 (International Classification of Diseases, Tenth Revision) codes, medications, laboratories, and procedures. We compared a DTNN to 3 BC approaches and a deep Cox proportional hazards model across 4 clinical conditions to examine distributional patterns across various subgroups. Time-varying area under the receiver operating characteristic curve (AUC_t) and time-varying average precision (AP_t) were our primary evaluation metrics. Results: TTE models consistently had comparable or higher AUC_t and AP_t than BC for all conditions. At clinically relevant time points, the (AUC) values for the YOB≤2020 (year-of-birth) DTNN and DCPH (deep Cox proportional hazard) models were 0.70 (95% CI 0.66-0.77) and 0.72 (95% CI 0.66-0.78) at t=5 for autism, [...] (95% CI 0.65-0.76) and 0.68 (95% CI 0.62-0.74) at t=7 for ADHD, [...] (95% CI 0.70-0.75) and 0.71 (95% CI 0.69-0.74) at t=1 for recurrent otitis media, and 0.74 (95% CI 0.68-0.82) and [...] (95% CI 0.63-0.77) for food allergy, compared with 0.6 (95% CI 0.55-0.66), 0.47 (95% CI 0.40-0.54), 0.73 (95% CI 0.70-0.75), and 0.77 (95% CI 0.71-0.82), respectively. The probabilities predicted by BC were positively correlated with censoring times, particularly for autism and ADHD prediction. Filtering strategies based on YOB or length of follow-up only partially corrected these biases. In subgroup analyses, DTNN predicted probabilities that accurately reflect actual prevalence and temporal trends.
Conclusions: BC underpredicted diagnosis likelihood and inappropriately assigned lower scores to individuals with earlier censoring. Common filtering strategies did not adequately address this limitation. TTE approaches, particularly DTNN, effectively mitigated bias from the censoring distribution, resulting in superior discrimination and calibration performance and more accurate prediction of prevalence. Machine learning practitioners should recognize the limitations of BC for long-horizon diagnosis prediction and adopt TTE approaches. The DTNN in particular is well-suited to mitigate the effects of right-censoring and maximize prediction performance in this setting.

Language: English

Cited by

0
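
The entry above contrasts binary classification with a discrete-time time-to-event approach under right-censoring. The sketch below illustrates the general discrete-time idea only: a model outputs one hazard per time interval, a censored patient contributes "survived through the last observed interval" to the likelihood, and a diagnosed patient contributes "survived until the event interval, then had the event". It is a generic textbook-style formulation under stated assumptions, not the authors' DTNN; the function names and the example hazards are hypothetical.

```python
import math


def survival_curve(hazards):
    """Turn per-interval hazards h_1..h_T into survival probabilities S_1..S_T.

    S_t is the probability of no event through interval t.
    """
    curve = []
    surviving = 1.0
    for h in hazards:
        surviving *= 1.0 - h
        curve.append(surviving)
    return curve


def neg_log_likelihood(hazards, time_index, observed):
    """Negative log-likelihood of one patient under a discrete-time hazard model.

    hazards: per-interval hazards predicted for this patient (each in (0, 1)).
    time_index: 0-based interval of the event, or of the last follow-up.
    observed: True if the diagnosis occurred in that interval,
              False if the patient was right-censored there.
    """
    # The patient is known to be event-free in every earlier interval.
    log_lik = sum(math.log(1.0 - h) for h in hazards[:time_index])
    if observed:
        log_lik += math.log(hazards[time_index])        # event in this interval
    else:
        log_lik += math.log(1.0 - hazards[time_index])  # still event-free when censored
    return -log_lik


# Hypothetical hazards for three yearly intervals.
example_hazards = [0.1, 0.2, 0.3]
print(survival_curve(example_hazards))
print(neg_log_likelihood(example_hazards, 1, observed=True))
print(neg_log_likelihood(example_hazards, 1, observed=False))
```

In this formulation a censored patient is never treated as a negative case for later intervals, which is the property that distinguishes it from labelling every undiagnosed patient as 0 in a binary classifier.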
