Examining the Health Information Quality and Accuracy of Conversational Agents and Generative AI Models in Response to Prompts Regarding Low Back Pain (Preprint)
Leo Li, Alessandra C. Marcelo, Curtis Cheuk Him Yu, et al.

Published: May 12, 2024

BACKGROUND: Low back pain (LBP) is a significant global public health concern with a large burden of disease on the population. With the increasing integration of AI technologies in healthcare, it is essential to evaluate their effectiveness in providing high-quality and accurate information when addressing common LBP concerns.

OBJECTIVE: The purpose of this research was to examine the quality and accuracy of conversational agents (CAs) and generative AI (GAI) models in response to questions about LBP.

METHODS: A systematic evaluation was conducted of four commonly used CAs and two GAI models using a piloted script of 25 prompts covering various aspects of LBP, including causes, treatment, the ability to exercise and work, and imaging. The responses were compiled and transcribed to assess quality and accuracy. Quality was assessed using the JAMA benchmark criteria and the DISCERN tool; accuracy was assessed by comparing responses against the UK NICE Low Back Pain and Sciatica guidelines and the Australian Low Back Pain Clinical Care Standard.

RESULTS: The study revealed variation in both quality and accuracy across the different models. Overall, responses exhibited poor quality but moderate accuracy. Siri demonstrated the best overall performance based on a combination of scores, whereas the voice-only agents performed worst across measures; some models scored highest on individual measures but rated lower overall.

CONCLUSIONS: These findings highlight the necessity for improvements in how these technologies deliver health information, to ensure that users receive reliable and up-to-date information regarding common health issues such as LBP.

CLINICALTRIAL: N/A
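To make the scoring step concrete, here is a minimal sketch of how per-prompt DISCERN totals could be aggregated per model. The model names and scores are invented for illustration, and the banding thresholds follow a common DISCERN convention (16 items rated 1-5, totals 16-80), not anything reported by this paper.

    # Minimal sketch: aggregate hypothetical DISCERN totals per model.
    # DISCERN has 16 items rated 1-5, so totals range from 16 to 80.
    # Model names and scores below are invented for illustration.
    from statistics import mean

    ratings = {
        "chatbot_a": [42, 38, 45, 40],      # one total per prompt
        "voice_agent_b": [28, 31, 25, 30],
    }

    for model, totals in ratings.items():
        avg = mean(totals)
        # One common banding: <39 poor, 39-50 fair, >50 good
        band = "poor" if avg < 39 else "fair" if avg <= 50 else "good"
        print(f"{model}: mean DISCERN {avg:.1f} ({band})")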

Language: English

Citations: 0

Artificial Intelligence and the Metaverse in Sport: Emerging Trends and Future Directions from a Bibliometric Analysis
Yusuf Esmer

Spor Bilimleri Dergisi, Journal Year: 2025, Volume and Issue: 36(1), P. 49 - 65

Published: March 28, 2025

In recent years, artificial intelligence (AI) and metaverse technologies have found applications in areas such as athlete performance analysis, fan engagement, and virtual event management, leading to a growing volume of research in these areas. The aim of this study is to reveal the bibliometric profile of scientific research on AI and the metaverse in sports and to increase knowledge at the intersection of these disciplines by analyzing production trends in the field. For this purpose, 255 publications from 1992-2025 in the Web of Science (WoS) core collection, retrieved in October 2024, were examined using the bibliometric analysis technique. As a result, it was concluded that studies in this field have increased over the years and attracted more attention; in particular, the high publication output in 2022 and the high citation rate in 2013 revealed the potential importance of these fields. The study is important in terms of identifying new research and application areas, emphasizing international collaborations, providing information and data for academia and the industry, encouraging interdisciplinary approaches, guiding future studies, and contributing to the relevant literature. In future studies, it is considered necessary to conduct further research and develop practical projects; in this context, it is recommended to explain the effects of these technologies and how they can be used in sports organizations.
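As an illustration of the counting step behind such publication-trend results, the sketch below tallies records per year from a hypothetical Web of Science tab-delimited export. The file name is invented; "PY" is the WoS field tag for publication year.

    # Sketch: count publications per year from a (hypothetical) WoS
    # tab-delimited export. "PY" is the WoS publication-year field tag.
    import csv
    from collections import Counter

    counts = Counter()
    with open("wos_export.txt", encoding="utf-8-sig") as fh:
        for record in csv.DictReader(fh, delimiter="\t"):
            year = record.get("PY", "").strip()
            if year:
                counts[year] += 1

    for year in sorted(counts):
        print(year, counts[year])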

Language: English

Citations: 0

Chatbots are Not Yet Safe for Emergency Care Patient Use: Deficiencies of AI Responses to Clinical Questions (Preprint)

J. Yau, Soheil Saadat, Edmund Hsu, et al.

Published: May 7, 2024

BACKGROUND: Recent surveys indicate that 48% of consumers actively use generative artificial intelligence (AI) for health-related inquiries. Despite widespread adoption and the potential to improve health care access, scant research examines the performance of AI chatbot responses regarding emergency care advice.

OBJECTIVE: We assessed the quality of AI chatbot responses to common emergency care questions, and sought to determine the qualitative differences in responses from 4 free-access chatbots across 10 different serious and benign emergency conditions.

METHODS: We created 10 emergency care questions that we fed into the free-access versions of ChatGPT 3.5 (OpenAI), Google Bard, Bing Chat (Microsoft), and Claude (Anthropic) on November 26, 2023. Each response was graded by 5 board-certified emergency medicine (EM) faculty across 8 domains: percentage accuracy, presence of dangerous information, factual accuracy, clarity, completeness, understandability, source reliability, and source relevancy. We determined the correct and complete responses using reputable and scholarly medical references; these were compiled by an EM resident physician. For the readability of the responses, we used the Flesch-Kincaid Grade Level statistics embedded in Microsoft Word. Differences between chatbots were assessed with the chi-square test.

RESULTS: The chatbots' responses to the 10 clinical questions were scored across the 8 domains by the 5 faculty, yielding 400 assessments per chatbot. Together, the chatbots had the best scores for clarity and understandability (both 85%), intermediate scores for accuracy and completeness (both about 50%), and poor scores (10%) for source relevance and reliability (sources were mostly unreported). Chatbot responses contained dangerous information in 5% to 35% of cases, with no statistical difference between chatbots on this metric (P=.24). ChatGPT and Claude had similar performances across 6 of the 8 domains. Only one chatbot performed better, with more identified or relevant sources (40%; others 0%-10%). Reading level was at the 7.7-8.9 grade level for all chatbots except one, at 10.8, which is too advanced for average patients. Responses included both appropriate advice (eg, starting cardiopulmonary resuscitation with no pulse check) and generally inappropriate advice (eg, loosening the collar to improve breathing without evidence of airway compromise).

CONCLUSIONS: AI chatbots, though ubiquitous, have significant deficiencies in emergency care patient advice, despite relatively consistent performance. Information on when to seek urgent or emergent care is frequently incomplete and inaccurate, and patients may be unaware of such misinformation. Sources are generally not provided. Patients who use AI chatbots to guide health care decisions assume the associated risks. AI chatbots for health advice should be subject to further research, refinement, and regulation, and we strongly recommend proper medical consultation to prevent adverse outcomes.
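For readers unfamiliar with the statistical comparison used here, the sketch below runs a chi-square test of independence on a chatbot-by-outcome contingency table with SciPy. The counts are invented for illustration; the paper reports only the 5%-35% range and P=.24.

    # Sketch: chi-square test on invented counts of responses flagged as
    # containing dangerous information. Rows are chatbots; columns are
    # [dangerous, not dangerous]. Counts are illustrative only.
    from scipy.stats import chi2_contingency

    table = [
        [3, 47],
        [6, 44],
        [10, 40],
        [17, 33],
    ]
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2={chi2:.2f}, dof={dof}, P={p:.3f}")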

Language: English

Citations: 0
