Management of Burns: Multi-Center Assessment Comparing AI Models and Experienced Plastic Surgeons (Open Access)
Gianluca Marcaccini, Ishith Seth, Bryan Lim

et al.

Journal of Clinical Medicine, Year: 2025, No. 14(9), pp. 3078-3078

Published: April 29, 2025

Background: Burn injuries require accurate assessment for effective management, and artificial intelligence (AI) is gaining attention in burn care for diagnosis, treatment planning, and decision support. This study compares the effectiveness of AI-driven models with that of experienced plastic surgeons in burn management. Methods: Ten anonymized burn images of varying severity and anatomical location were selected from publicly available databases. Three AI systems (ChatGPT-4o, Claude, and Kimi AI) analyzed these images, generating clinical descriptions and management plans. Experienced plastic surgeons reviewed the same images to establish a reference standard and evaluated the AI-generated recommendations using a five-point Likert scale for accuracy, relevance, and appropriateness. Statistical analyses, including Cohen’s kappa coefficient, assessed inter-rater reliability and comparative accuracy. Results: The AI models showed high diagnostic agreement with clinicians, with ChatGPT-4o achieving the highest ratings. However, outputs varied in specificity, occasionally lacking individualized considerations. Readability scores indicated that AI outputs were more comprehensible than traditional medical literature, though some were overly simplistic. Cohen’s kappa coefficient suggested moderate agreement among human evaluators. Conclusions: While AI models demonstrate strong accuracy and readability, further refinements are needed to improve specificity and personalization. This study highlights AI’s potential as a supplementary tool while emphasizing the need for human oversight to ensure safe patient care.
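The inter-rater analysis above rests on Cohen's kappa, which corrects raw agreement between two raters for the agreement expected by chance. A minimal sketch of the unweighted statistic follows; the `rater_1`/`rater_2` Likert vectors are hypothetical illustrations, not the study's data:

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    # Observed agreement: fraction of items both raters scored identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement: chance overlap of the raters' marginal distributions.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical five-point Likert scores from two evaluators (not study data).
rater_1 = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]
rater_2 = [5, 4, 3, 3, 5, 3, 4, 4, 3, 4]
print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")  # ~0.57, moderate agreement
```

For graded scales like a five-point Likert, a weighted kappa that penalizes near-misses less than distant disagreements is often preferred; the unweighted form above is the simplest variant.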

Language: English

Cited by

0

Leveraging Artificial Intelligence for Personalized Rehabilitation Programs for Head and Neck Surgery Patients (Open Access, Creative Commons)
Gianluca Marcaccini, Ishith Seth, Jennifer Novo

et al.

Technologies, Year: 2025, No. 13(4), pp. 142-142

Published: April 4, 2025

Background: Artificial intelligence (AI) and large language models (LLMs) are increasingly used in healthcare, with applications in clinical decision-making and workflow optimization. In head and neck surgery, postoperative rehabilitation is a complex, multidisciplinary process requiring personalized care. This study evaluates the feasibility of using LLMs to generate tailored rehabilitation programs for patients undergoing major head and neck surgical procedures. Methods: Ten hypothetical patient scenarios were developed, representing oncologic resections and complex reconstructions. Four LLMs (ChatGPT-4o, DeepSeek V3, Gemini 2, and Copilot) were prompted with identical queries to generate rehabilitation plans. Three senior clinicians independently assessed their quality, accuracy, and relevance on a five-point Likert scale. Readability and quality metrics, including the DISCERN score, Flesch Reading Ease, Flesch–Kincaid Grade Level, and Coleman–Liau Index, were applied. Results: ChatGPT-4o achieved the highest ratings (Likert mean 4.90 ± 0.32), followed by DeepSeek V3 (4.00 ± 0.82) and Gemini 2 (3.90 ± 0.74), while Copilot underperformed (2.70 ± 0.82). ChatGPT-4o also produced the most readable content. A statistical analysis confirmed significant differences across models (p < 0.001). Conclusions: LLMs can generate clinically relevant rehabilitation plans with varying accuracy and readability, and ChatGPT-4o generated the most clinically relevant plans. AI-generated plans may complement existing protocols, but further validation is necessary to assess their impact on patient outcomes.
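The readability figures cited above come from standard surface formulas over word, sentence, and syllable counts. A minimal sketch of the Flesch Reading Ease and Flesch–Kincaid Grade Level computations follows; the formula coefficients are the standard published ones, while the vowel-group syllable counter and the sample text are rough illustrative assumptions (production tools use pronunciation dictionaries):

```python
import re

def count_syllables(word):
    """Crude heuristic: count vowel groups; real tools use dictionaries."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    wps = n_words / sentences      # average words per sentence
    spw = syllables / n_words      # average syllables per word
    # Flesch Reading Ease: higher = easier (60-70 is roughly plain English).
    fre = 206.835 - 1.015 * wps - 84.6 * spw
    # Flesch-Kincaid Grade Level: approximate US school grade.
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return fre, fkgl

fre, fkgl = readability("Rest the jaw. Begin gentle range-of-motion exercises on day three.")
print(f"FRE = {fre:.1f}, FK grade = {fkgl:.1f}")
```

The Coleman–Liau Index used alongside these relies on letters and sentences per 100 words rather than syllables, which makes it less sensitive to syllable-counting errors; the DISCERN score, by contrast, is a rater-assigned quality instrument, not a computed formula.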

Language: English

Cited by

0
