
World Neurosurgery, Journal Year: 2025, Issue: 196, Pages: 123755 - 123755
Published: March 6, 2025
Artificial intelligence tools like ChatGPT have gained attention for their potential to support patient education by providing accessible, evidence-based information. This study compares the performance of ChatGPT 3.5 and ChatGPT 4.0 in answering common questions about low back pain, focusing on response quality, readability, and adherence to clinical guidelines, while also addressing the models' limitations in managing psychosocial concerns. Thirty frequently asked questions about low back pain were categorized into 4 groups: Diagnosis, Treatment, Psychosocial Factors, and Management Approaches. Responses generated by both models were evaluated on 3 key metrics: 1) quality, rated on a scale from 1 (excellent), with the highest score indicating an unsatisfactory response; 2) reliability, evaluated with the DISCERN criteria, with scores ranging from 1 (low reliability) to 5 (high reliability); and 3) readability, assessed using 7 readability formulas, including the Flesch-Kincaid and Gunning Fog indices. ChatGPT 4.0 significantly outperformed ChatGPT 3.5 in quality across all categories, with a mean score of 1.03 compared with 2.07 (P < 0.001). ChatGPT 4.0 also demonstrated higher DISCERN reliability scores (4.93 vs. 4.00). However, both versions struggled with psychosocial factor questions, where response quality was lower than in the other categories (P = 0.04). These concerns highlight the need for clinician oversight, particularly for emotionally sensitive issues. Enhancing artificial intelligence's capability to address psychosocial aspects of care should be a priority for future iterations.
Language: English
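For context on the readability metrics named in the methods: the Flesch-Kincaid and Gunning Fog formulas are simple arithmetic over word, sentence, and syllable counts. The Python sketch below is a minimal illustration, not the study's analysis code; it assumes the Flesch-Kincaid formula refers to the Grade Level variant, and the vowel-group syllable heuristic and sample sentence are placeholders for demonstration.

```python
import re

def count_syllables(word: str) -> int:
    """Rough English syllable estimate via vowel groups (heuristic, not exact)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1  # discount a typical silent final 'e'
    return max(n, 1)

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

def gunning_fog(text: str) -> float:
    """Gunning Fog Index: 0.4*((words/sentences) + 100*(complex words/words))."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * ((len(words) / sentences) + 100 * len(complex_words) / len(words))

sample = "Low back pain is common. Most episodes improve without surgery."
print(f"Flesch-Kincaid grade: {flesch_kincaid_grade(sample):.1f}")
print(f"Gunning Fog index: {gunning_fog(sample):.1f}")
```

Both scores approximate the U.S. school grade level needed to comprehend a text, which is how readability comparisons between model responses are typically interpreted in studies of this kind.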