Journal of Nuclear Cardiology, Journal Year: 2024, Volume and Issue: unknown, P. 102089 - 102089
Published: Nov. 1, 2024
Language: English
Medicine, Journal Year: 2024, Volume and Issue: 103(9), P. e37325 - e37325
Published: March 1, 2024
Large language models (LLMs) have been deployed in diverse fields, and the potential for their application in medicine has been explored through numerous studies. This study aimed to evaluate and compare the performance of ChatGPT-3.5, ChatGPT-4, Bing Chat, and Bard on an Emergency Medicine Board Examination question bank in the Korean language. Of the 2353 questions in the bank, 150 were randomly selected, and 27 containing figures were excluded. Questions that required abilities such as analysis, creative thinking, evaluation, and synthesis were classified as higher-order questions, and those requiring only recall, memory, and factual information in response were classified as lower-order questions. The answers and explanations obtained by inputting the 123 questions into the LLMs were analyzed and compared. ChatGPT-4 (75.6%) and Bing Chat (70.7%) showed higher correct answer rates than ChatGPT-3.5 (56.9%) and Bard (51.2%), with ChatGPT-4 also recording the highest rates across the two question types (76.5% and 71.4%). The appropriateness of the explanation given for the answer also differed significantly among the models (75.6%, 68.3%, 52.8%, and 50.4%, respectively). ChatGPT-4 and Bing Chat outperformed ChatGPT-3.5 and Bard in answering a random selection of Korean-language emergency medicine board examination questions.
Language: English
Citations: 10
BJA Open, Journal Year: 2024, Volume and Issue: 10, P. 100296 - 100296
Published: June 1, 2024
Language: English
Citations: 9
Academic Radiology, Journal Year: 2024, Volume and Issue: unknown
Published: Sept. 1, 2024
Language: English
Citations: 9
Frontiers in Digital Health, Journal Year: 2025, Volume and Issue: 7
Published: Feb. 3, 2025
Introduction: Artificial intelligence and machine learning are popular, interconnected technologies. AI chatbots like ChatGPT and Gemini show considerable promise in handling medical inquiries. This scoping review aims to assess the accuracy and response length (in characters) of these applications. Methods: Eligible databases were searched to find studies published in English from January 1 to October 20, 2023. The inclusion criteria consisted of studies that focused on using ChatGPT or Gemini in medicine and assessed outcomes based on accuracy and character count (length) of the responses. Data collected included the first author's name, the country where the study was conducted, the type of study design, publication year, sample size, medical speciality, accuracy, and response length. Results: The initial search identified 64 papers, with 11 meeting the inclusion criteria and involving 1,177 samples. ChatGPT showed higher accuracy in radiology (87.43% vs. Gemini's 71%) and shorter responses (907 vs. 1,428 characters). Similar trends were noted in other specialties. However, Gemini outperformed ChatGPT in emergency scenarios (87% vs. 77%) and in renal diets low in potassium and high in phosphorus (79% vs. 60% and 100% vs. 77%). Statistical analysis confirms that ChatGPT has greater accuracy and shorter responses than Gemini across the included studies, with a p-value <.001 for both metrics. Conclusion: This scoping review suggests that ChatGPT may demonstrate higher accuracy and provide shorter responses than Gemini in medical studies.
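The review's two outcome metrics are straightforward to compute per model: the share of graded-correct answers and the character length of each response. Below is a minimal Python sketch of that scoring, not code from the review itself; the `score_model` helper and all answer data are hypothetical, and in a real study the correctness grades would come from expert reviewers.
```python
# Hedged illustration only: scoring chatbot answers for the two metrics used
# in the review, accuracy (% correct) and response length (characters).
# All answer data here is invented; grading would come from expert review.
from statistics import mean

def score_model(graded_answers):
    """Return (accuracy %, mean character length) for (text, correct) pairs."""
    accuracy = 100.0 * sum(ok for _, ok in graded_answers) / len(graded_answers)
    avg_len = mean(len(text) for text, _ in graded_answers)
    return accuracy, avg_len

# Hypothetical graded samples for two chatbots on the same question set.
chatgpt = [("Concise radiology answer ...", True), ("Short reply", False)]
gemini  = [("A longer, more verbose radiology answer ...", True),
           ("Another verbose but incorrect answer ...", False)]

for name, data in [("ChatGPT", chatgpt), ("Gemini", gemini)]:
    acc, length = score_model(data)
    print(f"{name}: accuracy {acc:.1f}%, mean length {length:.0f} chars")
```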
Language: English
Citations: 1
Cureus, Journal Year: 2023, Volume and Issue: unknown
Published: Dec. 12, 2023
Purpose: This study aims to evaluate the performance of three large language models (LLMs), Generative Pre-trained Transformer (GPT)-3.5, GPT-4, and Google Bard, on the 2023 Japanese National Dentist Examination (JNDE) and to assess their potential for clinical applications in Japan. Methods: A total of 185 questions from the JNDE were used. These were categorized by question type and category. McNemar's test compared correct response rates between two LLMs, while Fisher's exact test evaluated the LLMs within each category. Results: The overall correct response rate was 73.5% for GPT-4, 66.5% for Bard, and 51.9% for GPT-3.5; GPT-4 showed a significantly higher correct response rate than Bard. In the category of essential questions, Bard achieved 80.5%, surpassing the passing criterion of 80%. In contrast, both GPT-4 and GPT-3.5 fell short of this benchmark, with GPT-4 attaining 77.6% and GPT-3.5 only 52.5%; the differences among these scores were significant (p<0.01). For general questions, the models scored 71.2%, 58.5%, and 52.5%, respectively, outperforming their results on professional dental questions (51.6%, 45.3%, and 35.9%, respectively), where the differences among the models were not statistically significant. All LLMs demonstrated lower accuracy on professional dentistry questions than on other question types. Conclusions: GPT-4 achieved the highest overall score on the JNDE, followed by Bard and GPT-3.5. However, only Bard surpassed the passing criterion for essential questions. To further understand the application of LLMs worldwide, more research on examinations across different languages is required.
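McNemar's test, named in the abstract's methods, compares two models answering the same questions by looking only at the discordant pairs (questions where exactly one model is correct). A minimal sketch follows, under invented per-question results rather than the authors' data or code:
```python
# Hedged sketch of McNemar's test on paired correct/incorrect outcomes for
# two models answering the same questions; the results below are invented.
from scipy.stats import chi2

gpt4 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # 1 = correct, aligned by question
bard = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]

# Only discordant pairs matter: b = GPT-4 right / Bard wrong, c = the reverse.
b = sum(1 for g, r in zip(gpt4, bard) if g == 1 and r == 0)
c = sum(1 for g, r in zip(gpt4, bard) if g == 0 and r == 1)

# Continuity-corrected McNemar statistic, chi-square distributed with 1 df.
stat = (abs(b - c) - 1) ** 2 / (b + c)
p_value = chi2.sf(stat, df=1)
print(f"b={b}, c={c}, statistic={stat:.3f}, p={p_value:.3f}")
```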
Language: English
Citations: 23
Clinical Rheumatology, Journal Year: 2024, Volume and Issue: 43(11), P. 3507 - 3513
Published: Sept. 28, 2024
Language: English
Citations: 8
Dentomaxillofacial Radiology, Journal Year: 2024, Volume and Issue: 53(6), P. 390 - 395
Published: June 7, 2024
Abstract Objectives: This study evaluated the performance of four large language model (LLM)-based chatbots by comparing their test results with those of dental students on an oral and maxillofacial radiology examination. Methods: ChatGPT, ChatGPT Plus, Bard, and Bing Chat were tested on 52 questions from regular dental college examinations. These were categorized into three educational content areas: basic knowledge, imaging and equipment, and image interpretation. They were also classified as multiple-choice questions (MCQs) and short-answer questions (SAQs). The chatbots' accuracy rates were compared with those of the students, and further analysis was conducted by question type. Results: The students' overall accuracy rate was 81.2%, while that of the chatbots varied (ChatGPT 50.0%, ChatGPT Plus 65.4%, Bing Chat 63.5%). ChatGPT Plus achieved a higher accuracy rate in basic knowledge than the students (93.8% vs. 78.7%). However, all chatbots performed poorly in image interpretation, with accuracy rates below 35.0%. All chatbots scored less than 60.0% on MCQs but performed better on SAQs. Conclusions: The chatbots' performance on the oral and maxillofacial radiology examination was unsatisfactory. Further training using specific, relevant data derived solely from reliable sources is required. Additionally, the validity of these chatbots' responses must be meticulously verified.
Language: English
Citations: 7
Pediatric Radiology, Journal Year: 2024, Volume and Issue: 54(10), P. 1729 - 1737
Published: Aug. 12, 2024
Language: English
Citations: 7
CardioVascular and Interventional Radiology, Journal Year: 2024, Volume and Issue: 47(6), P. 836 - 837
Published: Feb. 22, 2024
Language: English
Citations: 6
Journal of College of Physicians And Surgeons Pakistan, Journal Year: 2024, Volume and Issue: unknown, P. 761 - 766
Published: July 1, 2024
Objective: To compare the knowledge accuracy of ChatGPT-4 and Google Bard in response to knowledge-based questions related to orthodontic diagnosis and treatment modalities.
Language: English
Citations: 6