ChatGPT-4 for addressing patient-centred frequently asked questions in age-related macular degeneration clinical practice DOI Creative Commons
Henrietta Wang, Amanda Ie, Thomas Chan

et al.

Eye, Journal Year: 2025, Volume and Issue: unknown

Published: April 15, 2025

Abstract. Purpose: Large language models have shown promise in answering questions related to medical conditions. This study evaluated the responses of ChatGPT-4 to patient-centred frequently asked questions (FAQs) relevant to age-related macular degeneration (AMD). Methods: Ten experts took part, spanning a range of clinical, education and research practices in optometry and ophthalmology. Over 200 patient-centric FAQs from authoritative professional society, hospital and advocacy websites were condensed into 37 questions across four themes: definition, causes and risk factors, symptoms and detection, and treatment and follow-up. The questions were individually input into ChatGPT-4 to generate responses. The responses were graded by the experts using a 5-point Likert scale (1 = strongly disagree; 5 = strongly agree) across four domains: coherency, factuality, comprehensiveness and safety. Results: Across all themes and domains, median scores were 4 (“agree”). Comprehensiveness had the lowest scores across domains (mean 3.8 ± 0.8), followed by factuality (mean 3.9), safety (mean 4.1 ± 0.8) and coherency (mean 4.3 ± 0.7). Examination of the individual questions showed that 5 (14%), 21 (57%), 23 (62%) and 9 (24%) had average scores below 4 (below “agree”) for coherency, factuality, comprehensiveness and safety, respectively. Free-text comments highlighted issues related to superseded or older technologies, and techniques that are not routinely used in clinical practice, such as genetic testing. Conclusions: ChatGPT-4 responses to AMD FAQs were generally agreeable in terms of coherency, factuality, comprehensiveness and safety. However, areas of weakness were identified, precluding recommendations for routine use to provide patients with tailored counselling in AMD.

Language: English

Re: ‘Using ChatGPT-4 in visual field test assessment’ DOI
Jack Phu, Henrietta Wang, Michael Kalloniatis

et al.

Clinical and Experimental Optometry, Journal Year: 2025, Volume and Issue: unknown, P. 1 - 2

Published: March 3, 2025

Language: English

Citations

0

Coherent Interpretation of Entire Visual Field Test Reports Using a Multimodal Large Language Model (ChatGPT) DOI Creative Commons
Jeremy Tan

Vision, Journal Year: 2025, Volume and Issue: 9(2), P. 33 - 33

Published: April 11, 2025

This study assesses the accuracy and consistency of a commercially available large language model (LLM) in extracting and interpreting sensitivity and reliability data from entire visual field (VF) test reports for the evaluation of glaucomatous defects. Single-page anonymised VF reports from 60 eyes of the study subjects were analysed by an LLM (ChatGPT 4o) across four domains: test reliability, defect type, defect severity and overall diagnosis. The main outcome measures were the accuracy of data extraction, interpretation of glaucomatous defects and diagnostic classification. The LLM displayed 100% accuracy in the extraction of global metrics and in classifying test reliability. It also demonstrated high accuracy (96.7%) in diagnosing whether the VF was consistent with a healthy, glaucoma suspect or glaucomatous eye. Accuracy in correctly defining the defect type was moderate (73.3%), which only partially improved when the LLM was provided with a more defined region of interest. The causes of incorrect defect type were mostly attributed to the wrong location, particularly confusing the superior and inferior hemifields. Numerical/text-based extraction and interpretation was notably more accurate than image-based interpretation. The study demonstrates the potential and limitations of multimodal LLMs in processing medical investigation data such as VF reports.

Language: English

Citations

0
