Accuracy of Online Symptom-Assessment Applications, Large Language Models, and Laypeople for Self-Triage Decisions: A Systematic Review DOI Creative Commons
Marvin Kopka, Niklas von Kalckreuth, Markus A. Feufel

et al.

medRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: Sep. 14, 2024

Abstract: Symptom-Assessment Applications (SAAs, e.g., NHS 111 online) that assist medical laypeople in deciding if and where to seek care (self-triage) are gaining popularity, and their accuracy has been examined in numerous studies. With the public release of Large Language Models (LLMs, e.g., ChatGPT), their use in such decision-making processes is growing as well. However, there is currently no comprehensive evidence synthesis for LLMs, and no review has contextualized the accuracy of SAAs and LLMs relative to the accuracy of their users. Thus, this systematic review evaluates the self-triage accuracy of both SAAs and LLMs and compares them to the accuracy of medical laypeople. A total of 1549 studies were screened, with 19 included in the final analysis. The self-triage accuracy of SAAs was found to be moderate but highly variable (11.5–90.0%), while the accuracy of LLMs (57.8–76.0%) and laypeople (47.3–62.4%) was moderate with low variability. Despite some published recommendations to standardize evaluation methodologies, there remains considerable heterogeneity among studies. The use of SAAs should not be universally recommended or discouraged; rather, their utility should be assessed based on the specific use case and tool under consideration.

Language: English

Accuracy of online symptom assessment applications, large language models, and laypeople for self–triage decisions DOI Creative Commons
Marvin Kopka, Niklas von Kalckreuth, Markus A. Feufel

et al.

npj Digital Medicine, Year: 2025, Issue: 8(1)

Published: Mar. 25, 2025

Abstract: Symptom-Assessment Applications (SAAs, e.g., NHS 111 online) that assist laypeople in deciding if and where to seek care (self-triage) are gaining popularity, and Large Language Models (LLMs) are increasingly used too. However, there is no evidence synthesis on the accuracy of LLMs, and no review has contextualized the accuracy of SAAs and LLMs. This systematic review evaluates the self-triage accuracy of both SAAs and LLMs and compares them to the accuracy of laypeople. A total of 1549 studies were screened and 19 included. The self-triage accuracy of SAAs was moderate but highly variable (11.5–90.0%), while the accuracy of LLMs (57.8–76.0%) and laypeople (47.3–62.4%) was moderate with low variability. Based on the available evidence, the use of SAAs or LLMs should neither be universally recommended nor discouraged; rather, we suggest that their utility should be assessed based on the specific use case and user group under consideration.

Language: English

Cited by

0
