
Frontiers in Digital Health, Journal year: 2025, Number 7
Published: March 3, 2025
Background: Artificial intelligence (AI) has made great strides. To explore the potential of Large Language Models (LLMs) in providing medical services to patients and assisting physicians in clinical practice, our study evaluated their performance in responding to questions related to autoimmune diseases.
Methods: 46 questions related to autoimmune diseases were input into ChatGPT 3.5, ChatGPT 4.0, and Gemini. The responses were then evaluated by rheumatologists based on five quality dimensions: relevance, correctness, completeness, helpfulness, and safety. Simultaneously, the responses were assessed by laboratory specialists across six fields: concept, features, report interpretation, diagnosis, prevention and treatment, and prognosis. Finally, statistical analysis and comparisons were performed on the three chatbots across the dimensions and fields.
Results: ChatGPT 4.0 outperformed both ChatGPT 3.5 and Gemini across all dimensions, with an average score of 199.8 ± 10.4, significantly higher than ChatGPT 3.5 (175.7 ± 16.6) and Gemini (179.1 ± 11.8) (p = 0.009 and p = 0.001, respectively). The differences between ChatGPT 3.5 and Gemini were not statistically significant. Specifically, ChatGPT 4.0 demonstrated superior relevance (p < 0.0001, 0.0001), completeness (0.0006), correctness (0.0002), helpfulness, and safety (0.0025) compared with ChatGPT 3.5 and Gemini. Furthermore, ChatGPT 4.0 scored higher in fields such as report interpretation (0.0025), prevention and treatment (0.0103), and prognosis (0.0458, 0.0458).
Conclusions: This study demonstrates that ChatGPT 4.0 outperforms ChatGPT 3.5 and Gemini in addressing questions related to autoimmune diseases, showing notable advantages across domains. These findings further highlight the potential of large language models in enhancing healthcare services.
Language: English