AI in Accounting: Can AI Models Like ChatGPT and Gemini Successfully Pass the Portuguese Chartered Accountant Exam?
Agostinho Sousa Pinto, António Abreu, Eusébio Costa, et al.

Lecture Notes in Networks and Systems, Journal year: 2024, Number: unknown, pp. 429 - 438

Published: Dec. 16, 2024

Language: English

Comparison of ChatGPT, Gemini, and Le Chat with physician interpretations of medical laboratory questions from an online health forum
A. Meyer, Ari Soleman, Janik Riese, et al.

Clinical Chemistry and Laboratory Medicine (CCLM), Journal year: 2024, Number: 62(12), pp. 2425 - 2434

Published: May 28, 2024

Abstract Objectives Laboratory medical reports are often not intuitively comprehensible to non-medical professionals. Given their recent advancements, easier accessibility and remarkable performance on licensing exams, patients are therefore likely to turn to artificial intelligence-based chatbots to understand their laboratory results. However, empirical studies assessing the efficacy of these chatbots in responding to real-life patient queries regarding laboratory medicine are scarce. Methods Thus, this investigation included 100 inquiries from an online health forum, specifically addressing Complete Blood Count interpretation. The aim was to evaluate the proficiency of three chatbots (ChatGPT, Gemini and Le Chat) against the responses of certified physicians. Results The findings revealed that the chatbots' interpretations of laboratory results were inferior to those of the physicians. While the chatbots exhibited a higher degree of empathetic communication, they frequently produced erroneous or overly generalized responses to complex questions. The appropriateness of chatbot responses ranged from 51 to 64 %, with 22 to 33 % overestimating the underlying medical conditions. A notable positive aspect was the consistent inclusion of disclaimers regarding the chatbots' nature and of recommendations to seek professional advice. Conclusions The responses to real-life queries highlight a dangerous dichotomy – perceived trustworthiness potentially obscuring factual inaccuracies. Given the growing inclination towards self-diagnosis using AI platforms, further research and improvement are imperative to increase patients' awareness and avoid future burdens on the healthcare system.

Language: English

Cited

7

The performance of OpenAI ChatGPT-4 and Google Gemini in virology multiple-choice questions: a comparative analysis of English and Arabic responses
Malik Sallam, Kholoud Al-Mahzoum, Rawan Ahmad Almutawaa, et al.

BMC Research Notes, Journal year: 2024, Number: 17(1)

Published: Sep. 3, 2024

Language: English

Cited

4

Beyond Green Labels: Assessing Mutual Funds’ ESG Commitments through Large Language Models
Katherine Wood, Chaehyun Pyun, Hieu Pham, et al.

Finance Research Letters, Journal year: 2025, Number: 74, pp. 106713 - 106713

Published: Jan. 5, 2025

Language: English

Cited

0

Chat GPT 4o VS Residents: French Language Evaluation in Ophthalmology
Leah Attal, Elad Shvartz, Nakhoul Nakhoul, et al.

Deleted Journal, Journal year: 2025, Number: 2(1), pp. 100104 - 100104

Published: Jan. 31, 2025

Language: English

Cited

0

Evaluating the Efficacy of Artificial Intelligence-Driven Chatbots in Addressing Queries on Vernal Conjunctivitis
Muhammad Saad, Muhammad A Moqeet, Hassan Mansoor, et al.

Cureus, Journal year: 2025, Number: unknown

Published: Feb. 26, 2025

Background Vernal keratoconjunctivitis (VKC) is a recurrent allergic eye disease that requires accurate patient education to ensure proper management. AI-driven chatbots, such as Google Gemini Advanced (Mountain View, California, US), are increasingly being explored as potential tools for providing medical information. This study evaluates the accuracy, reliability, and clinical applicability of Gemini Advanced in addressing VKC-related queries. Objective To assess the chatbot's performance in delivering medically relevant information about VKC and to evaluate its reliability based on expert ratings. Methods A total of 125 responses generated by the chatbot to 25 questions were assessed by two independent cornea specialists. Responses were rated for accuracy, completeness, and potential harm using a 5-point Likert scale (1-5). Inter-rater reliability was measured using Cronbach's alpha. Responses were categorized as highly reliable (score 5), having minor inconsistencies (score 4), or inaccurate (scores 1-3). Results The ratings demonstrated high inter-rater reliability (Cronbach's alpha = 0.92, 95% CI: 0.87-0.94). Of the responses, 108 (86.4%) were highly reliable (score 5), while 17 (13.6%) had minor inconsistencies (score 4) but posed no harm. No responses were classified as inaccurate or potentially harmful. The combined mean score was 4.88 ± 0.31, reflecting strong agreement between raters. The chatbot consistently provided reliable information across diagnostic, treatment, and prognosis-related queries, with gaps in complex grading and treatment-related discussions. Discussion The findings support the use of AI-driven chatbots like Gemini Advanced in ophthalmology. The chatbot exhibited high accuracy and consistency, particularly for general queries. However, areas for improvement remain, especially in detailed guidance on treatment protocols and in ensuring completeness for complex questions. Conclusion Gemini Advanced demonstrates high reliability in addressing queries on VKC, making it a valuable tool for patient education. While consistent and generally accurate, expert oversight remains necessary to refine AI-generated content for clinical applications. Further research is needed to enhance chatbots' ability to provide nuanced advice and to integrate them safely into ophthalmic decision-making.
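The reliability figure above (Cronbach's alpha = 0.92) is straightforward to reproduce. Below is a minimal sketch, assuming two raters' 5-point Likert scores are available as plain Python lists; the function name and the example scores are hypothetical illustrations, not taken from the study.

```python
# Minimal sketch: Cronbach's alpha with the two raters treated as "items".
# The example scores are hypothetical, not the study's data.
import statistics

def cronbach_alpha(ratings_by_rater):
    """ratings_by_rater: equal-length lists of scores, one list per rater."""
    k = len(ratings_by_rater)                       # number of raters
    item_vars = [statistics.variance(r) for r in ratings_by_rater]
    totals = [sum(scores) for scores in zip(*ratings_by_rater)]
    return k / (k - 1) * (1 - sum(item_vars) / statistics.variance(totals))

rater_a = [5, 5, 4, 5, 5, 4, 5, 5]
rater_b = [5, 5, 4, 5, 4, 4, 5, 5]
print(f"alpha = {cronbach_alpha([rater_a, rater_b]):.2f}")   # alpha = 0.85
```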

Language: English

Cited

0

Using Large Language Models in the Diagnosis of Acute Cholecystitis: Assessing Accuracy and Guidelines Compliance
Marta Goglia, Arianna Cicolani, Francesco Maria Carrano, et al.

The American Surgeon, Journal year: 2025, Number: unknown

Published: Mar. 12, 2025

Background Large language models (LLMs) are advanced tools capable of understanding and generating human-like text. This study evaluated the accuracy of several commercial LLMs in addressing clinical questions related to the diagnosis and management of acute cholecystitis, as outlined in the Tokyo Guidelines 2018 (TG18). We assessed their congruence with the expert panel discussions presented in the guidelines. Methods ChatGPT4.0, Gemini Advanced, and GPTo1-preview were assessed on ten clinical questions. Eight were derived from TG18, and two were formulated by the authors. Two authors independently rated each LLM's responses on a four-point scale: (1) accurate and comprehensive, (2) accurate but not comprehensive, (3) partially accurate, partially inaccurate, and (4) entirely inaccurate. A third author resolved any scoring discrepancies. Then, we comparatively analyzed the performance of ChatGPT4.0 against the newer large language models (LLMs), specifically Gemini Advanced and GPTo1-preview, on the same question set to delineate their respective strengths and limitations. Results ChatGPT4.0 provided consistent responses for 90% of the questions. It delivered "accurate and comprehensive" answers for 4/10 (40%) and "accurate but not comprehensive" answers for 5/10 (50%). One response (10%) was "partially inaccurate." The newer models demonstrated higher accuracy on some questions but yielded a similar percentage of "partially inaccurate" responses. Notably, neither model produced "entirely inaccurate" answers. Discussion LLMs, such as ChatGPT, demonstrate potential in accurately addressing clinical questions regarding acute cholecystitis. With awareness of their limitations, careful implementation, and ongoing refinement, LLMs could serve as valuable resources for physician education and patient information, potentially improving clinical decision-making in the future.
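The four-point rating protocol described above (two independent raters, a third resolving disagreements) can be made concrete with a short sketch; all names and scores below are hypothetical illustrations, not the authors' code or data.

```python
# Sketch of the rating protocol: two authors score each LLM response on a
# four-point scale (1 = accurate and comprehensive ... 4 = entirely
# inaccurate); a third author settles any disagreement. Hypothetical data.
from collections import Counter

def adjudicate(rater1, rater2, tiebreaker):
    """Return one consensus score per response."""
    return [r1 if r1 == r2 else tb
            for r1, r2, tb in zip(rater1, rater2, tiebreaker)]

rater1     = [1, 2, 2, 1, 3, 2, 1, 2, 2, 1]
rater2     = [1, 2, 3, 1, 3, 2, 1, 2, 2, 2]
tiebreaker = [1, 2, 2, 1, 3, 2, 1, 2, 2, 1]

for score, n in sorted(Counter(adjudicate(rater1, rater2, tiebreaker)).items()):
    print(f"score {score}: {n}/10 responses ({n * 10}%)")
```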

Language: English

Cited

0

Comparative Analysis of ChatGPT and Google Gemini in Generating Patient Educational Resources on Cardiac Health: A Focus on Exercise-Induced Arrhythmia, Sleep Habits, and Dietary Habits
Nithin Karnan, Sumaiya Fatima, Palwasha Nasir, et al.

Cureus, Journal year: 2025, Number: unknown

Published: Mar. 18, 2025

Language: English

Cited

0

Artificial intelligence in academic writing: Enhancing or replacing human expertise?
Ria Resti Fauziah, Ari Metalin Ika Puspita, Ivo Yuliana, et al.

Journal of Clinical Neuroscience, Journal year: 2025, Number: unknown, pp. 111193 - 111193

Published: Mar. 1, 2025

Language: English

Cited

0

Analysis of Patient Education Guides Generated by ChatGPT and Gemini on Common Anti-diabetic Drugs: A Cross-Sectional Study
Jude Saji, Aswini Balagangatharan, Sarita Bajaj, et al.

Cureus, Journal year: 2025, Number: unknown

Published: Mar. 25, 2025

Language: English

Cited

0

The Performance of OpenAI ChatGPT-4 and Google Gemini in Virology Multiple-Choice Questions: A Comparative Analysis of English and Arabic Responses
Malik Sallam, Kholoud Al-Mahzoum, Rawan Ahmad Almutawaa, et al.

Research Square, Journal year: 2024, Number: unknown

Published: April 12, 2024

Abstract Background: The integration of artificial intelligence (AI) in healthcare education is inevitable. Understanding the proficiency of generative AI in different languages to answer complex questions is crucial for educational purposes. Objective: To compare the performance of ChatGPT-4 and Gemini in answering Virology multiple-choice questions (MCQs) in English and Arabic, while assessing the quality of the generated content. Methods: Both models' responses to 40 MCQs were assessed for correctness based on the CLEAR tool designed for the evaluation of AI-generated content. The questions were classified into lower and higher cognitive categories per the revised Bloom's taxonomy. The study design considered the METRICS checklist for reporting AI-based studies in healthcare. Results: ChatGPT-4 performed better compared with Gemini, consistently surpassing its scores: ChatGPT-4 led with 80% vs. 62.5% correct responses in English and 65% vs. 55% in Arabic. For both models, superior performance in lower cognitive domains was reported. Conclusion: Both models exhibited potential for healthcare educational applications; nevertheless, their performance varied across languages, highlighting the importance of continued development to ensure the effective use of generative AI globally.
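The reported accuracy percentages follow from simple counts over the 40 MCQs; the sketch below back-calculates the correct-answer counts from the stated percentages (an assumption for illustration, not the study's data) and reprints the comparison.

```python
# Sketch: per-language accuracy over the 40 MCQs. Correct-answer counts are
# back-calculated from the percentages reported in the abstract.
TOTAL = 40
correct = {
    ("ChatGPT-4", "English"): 32,   # 32/40 = 80.0%
    ("Gemini",    "English"): 25,   # 25/40 = 62.5%
    ("ChatGPT-4", "Arabic"):  26,   # 26/40 = 65.0%
    ("Gemini",    "Arabic"):  22,   # 22/40 = 55.0%
}
for (model, language), n in correct.items():
    print(f"{model:9} {language:7} {n}/{TOTAL} = {n / TOTAL:.1%}")
```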

Language: English

Cited

1