Evaluating Large Language Models for Burning Mouth Syndrome Diagnosis
Takayuki Suga, Osamu Uehara, Yoshihiro Abiko, et al.

Journal of Pain Research, Journal Year: 2025, Volume and Issue: 18, P. 1387 - 1405

Published: March 1, 2025

Large language models have been proposed as diagnostic aids across various medical fields, including dentistry. Burning mouth syndrome, characterized by burning sensations in the oral cavity without identifiable cause, poses diagnostic challenges. This study explores the accuracy of large language models in identifying burning mouth syndrome, hypothesizing potential limitations. Clinical vignettes of 100 synthesized burning mouth syndrome cases were evaluated using three large language models (ChatGPT-4o, Gemini Advanced 1.5 Pro, and Claude 3.5 Sonnet). Each vignette included patient demographics, symptoms, and history. The models were prompted to provide a primary diagnosis, differential diagnoses, and their reasoning. Accuracy was determined by comparing the models' responses with expert evaluations. ChatGPT achieved an accuracy rate of 99%, while Gemini's was 89% (p < 0.001). Misdiagnoses included Persistent Idiopathic Facial Pain and combined diagnoses with inappropriate conditions. Differences were also observed in reasoning patterns and additional data requests across models. Despite high overall accuracy, the models exhibited variations in reasoning approaches and occasional errors, underscoring the importance of clinician oversight. Limitations include the synthesized nature of the vignettes, over-reliance on exclusionary criteria, and challenges in differentiating overlapping disorders. Large language models demonstrate strong potential as supplementary diagnostic tools for burning mouth syndrome, especially in settings lacking specialist expertise. However, their reliability depends on thorough assessment and expert verification. Integrating large language models into routine diagnostics could enhance early detection and management, ultimately improving clinical decision-making for dentists and specialists alike.

Language: English

Artificial intelligence in rheumatology research: what is it good for?
José Miguel Sequí-Sabater, Diego Benavent

RMD Open, Journal Year: 2025, Volume and Issue: 11(1), P. e004309 - e004309

Published: Jan. 1, 2025

Artificial intelligence (AI) is transforming rheumatology research, with a myriad of studies aiming to improve diagnosis, prognosis and treatment prediction, while also showing potential capability to optimise the research workflow, drug discovery and clinical trials. Machine learning, a key element of discriminative AI, has demonstrated the ability to accurately classify rheumatic diseases and predict therapeutic outcomes by using diverse data types, including structured databases, imaging and text. In parallel, generative AI, driven by large language models, is becoming a powerful tool for optimising the research workflow by supporting content generation, literature review automation and clinical decision support. This review explores the current applications and future potential of both discriminative and generative AI in rheumatology. It also highlights the challenges posed by these technologies, such as ethical concerns and the need for rigorous validation and regulatory oversight. The integration of AI in rheumatology promises substantial advancements but requires a balanced approach to optimise the benefits and minimise the possible downsides.

Language: English

Citations: 1

Rheumatology in the digital health era: status quo and quo vadis?
Johannes Knitza, Latika Gupta, Thomas Hügle, et al.

Nature Reviews Rheumatology, Journal Year: 2024, Volume and Issue: 20(12), P. 747 - 759

Published: Oct. 31, 2024

Language: English

Citations: 6

Artificial Intelligence in Peer Review: Enhancing Efficiency While Preserving Integrity
Bohdana Doskaliuk, Olena Zimba, Marlen Yessirkepov, et al.

Journal of Korean Medical Science, Journal Year: 2025, Volume and Issue: 40(7)

Published: Jan. 1, 2025

The rapid advancement of artificial intelligence (AI) has transformed various aspects of scientific research, including academic publishing and peer review. In recent years, AI tools such as large language models have demonstrated their capability to streamline numerous tasks traditionally handled by human editors and reviewers. These applications range from automated grammar checks and plagiarism detection to format compliance and even preliminary assessment of research significance. While AI substantially benefits the efficiency and accuracy of academic processes, its integration raises critical ethical and methodological questions, particularly in peer review. AI lacks the subtle understanding of complex scientific content that human expertise provides, posing challenges in evaluating research novelty and significance. Additionally, there are risks associated with over-reliance on AI, potential biases in AI algorithms, and ethical concerns related to transparency, accountability, and data privacy. This review evaluates the perspectives within the scientific community on integrating AI in peer review and academic publishing. By exploring both AI's potential benefits and limitations, we aim to offer practical recommendations to ensure AI is used as a supportive tool, supporting but not replacing human expertise. Such guidelines are essential for preserving the integrity and quality of academic work while benefiting from AI's efficiencies in editorial processes.

Language: English

Citations: 0
