Feasibility of GPT-3.5 versus Machine Learning for Automated Surgical Decision-Making Determination: A Multicenter Study on Suspected Appendicitis
Sebastian Sanduleanu, Koray Ersahin, Johannes Bremm et al.

AI, Journal Year: 2024, Volume and Issue: 5(4), P. 1942 - 1954

Published: Oct. 16, 2024

Background: Nonsurgical treatment of uncomplicated appendicitis is a reasonable option in many cases despite the sparsity of robust, easy-access, externally validated, and multimodally informed clinical decision support systems (CDSSs). Developed by OpenAI, the Generative Pre-trained Transformer 3.5 model (GPT-3.5) may provide enhanced decision support for surgeons who are less certain, or for patients posing higher-risk (relative) operative contraindications. Our objective was to determine whether GPT-3.5, when provided with high-throughput clinical, laboratory, and radiological text-based information, would reach decisions similar to those of a machine learning model and a board-certified surgeon (reference standard) regarding appendectomy versus conservative treatment. Methods: In this cohort study, we randomly collected patients presenting at the emergency departments (EDs) of two German hospitals (GFO, Troisdorf, and University Hospital Cologne) with right lower abdominal pain between October 2022 and October 2023. Statistical analysis was performed using R, version 3.6.2, on RStudio, version 2023.03.0 + 386. Overall agreement between GPT-3.5 output and the reference standard was assessed by means of inter-observer kappa values as well as accuracy, sensitivity, specificity, and positive and negative predictive values with the "Caret" and "irr" packages. Statistical significance was defined as p < 0.05. Results: The surgeon's decision and GPT-3.5 agreed in 102 of 113 cases, and all cases where surgery was decided upon were correctly classified by GPT-3.5. The estimated machine learning training accuracy was 83.3% (95% CI: 74.0, 90.4), while the validation accuracy was 87.0% (95% CI: 66.4, 97.2). In comparison, GPT-3.5 reached 90.3% (95% CI: 83.2, 95.0), which did not perform significantly better (p = 0.21). Conclusions: This is, to our knowledge, the first "intended use" study of GPT-3.5 for surgical decision-making compared against an algorithm, and it found a high degree of agreement with the reference standard in patients presenting with right lower abdominal pain.
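The agreement statistics reported in this abstract (inter-observer kappa alongside accuracy) were computed with R's "Caret" and "irr" packages, but the same quantities are easy to derive directly. The sketch below computes Cohen's kappa and raw accuracy from two label sequences; the decision labels and counts are illustrative, not study data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length label sequences."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of cases where the raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters labeled independently.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Illustrative cohort: 60 operative and 40 conservative reference decisions,
# with 5 disagreements in each direction (NOT the study's actual data).
surgeon = ["surgery"] * 60 + ["conservative"] * 40
gpt = (["surgery"] * 55 + ["conservative"] * 5 +
       ["conservative"] * 35 + ["surgery"] * 5)

accuracy = sum(s == g for s, g in zip(surgeon, gpt)) / len(surgeon)
kappa = cohens_kappa(surgeon, gpt)
```

With these illustrative counts, observed agreement is 0.90 while chance agreement is 0.52, so kappa lands near 0.79 — the correction for chance is why kappa is reported alongside plain accuracy.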

Language: English

Using Generative Pre-Trained Transformer-4 (GPT-4), ffmpeg, and Microsoft Azure to Aid in Creating a Text-to-Video Generation Tool to Improve Safety Shares and Incident Descriptions in the Mining Industry
Tulio Dias de Almeida, Nelson Oliveira, Chun Lin He et al.

Mining Metallurgy & Exploration, Journal Year: 2025, Volume and Issue: unknown

Published: March 27, 2025

Language: English

Citations: 0

Conversational Guide for Cataract Surgery Complications: A Comparative Study of Surgeons versus Large Language Model-Based Chatbot Generated Instructions for Patient Interaction

Sathishkumar Sundaramoorthy, Vineet Ratra, Vijay Shankar S et al.

Ophthalmic Epidemiology, Journal Year: 2025, Volume and Issue: unknown, P. 1 - 8

Published: April 2, 2025

It is difficult to explain the complications of surgery to patients. Care has to be taken to convey the facts clearly and objectively while expressing concern for their wellbeing. This study compared responses from surgeons with those of a large language model (LLM)-based chatbot. We presented 10 common scenarios of cataract surgery complications to seven senior surgeons and the chatbot. The responses were graded by two independent graders for comprehension, readability, and complexity using previously validated indices, and were analyzed for accuracy and completeness. Honesty and empathy were assessed in both groups. Scores were averaged and tabulated. The surgeons' readability scores (10.64) were significantly less complex than the chatbot's (12.54) (p < 0.001). Surgeons' responses were shorter, whereas the chatbot tended to give more detailed answers. The average completeness score of chatbot-generated conversations was 2.36 (0.55), which was similar to the surgeons' 2.58 (0.36) (p = 0.164). The chatbot's responses were generalized, lacking specific alternative measures. While the chatbot scored higher on empathy (1.81 vs. 1.20, p = 0.041), honesty showed no significant difference. The LLM-based chatbot gave a good description of the complication, though it is unclear whether it had an in-depth understanding of the situation; its responses were complete and scored well on empathy. With training on real-world specialized ophthalmologic data, chatbots could be used to assist in counselling patients on postoperative complications.
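The readability scores above (10.64 vs. 12.54) are on the scale of grade-level indices such as Flesch-Kincaid. The abstract does not name the exact index used, so the sketch below is an illustrative Flesch-Kincaid Grade Level with a crude syllable heuristic, not the authors' method.

```python
import re

def count_syllables(word):
    """Crude vowel-group heuristic; adequate for a rough grade estimate."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Drop a trailing silent "e" (but keep "-le"/"-ee" endings).
    if word.endswith("e") and count > 1 and not word.endswith(("le", "ee")):
        count -= 1
    return max(count, 1)

def fk_grade(text):
    """Flesch-Kincaid Grade Level: higher means harder to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

simple = "The cat sat. The dog ran."
clinical = ("Phacoemulsification complications necessitate "
            "comprehensive perioperative counselling.")
```

Short, plain sentences score far below dense clinical phrasing on this index, which is the sense in which the surgeons' responses (grade 10.64) were "less complex" than the chatbot's (12.54).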

Language: English

Citations: 0

Integrating domain-specific knowledge and fine-tuned general-purpose large language models for question-answering in construction engineering management
Shenghua Zhou, Xiaoyang Liu, Dezhi Li et al.

Automation in Construction, Journal Year: 2025, Volume and Issue: 175, P. 106206 - 106206

Published: April 21, 2025

Language: English

Citations: 0

The role of artificial intelligence in medical education: an evaluation of Large Language Models (LLMs) on the Turkish Medical Specialty Training Entrance Exam
Murat Koçak, Ali Kemal Oğuz, Zafer Akçalı et al.

BMC Medical Education, Journal Year: 2025, Volume and Issue: 25(1)

Published: April 25, 2025

Abstract. Objective: To evaluate the performance of advanced large language models (LLMs), namely OpenAI ChatGPT 4, Google AI Gemini 1.5 Pro, Cohere Command R+, and Meta AI Llama 3 70B, on questions from the Turkish Medical Specialty Training Entrance Exam (2021, 1st semester), and to analyze their answers for user interpretability in languages other than English. Methods: The study used the Basic Sciences and Clinical exams held on March 21, 2021. The 240 questions were presented to the LLMs in Turkish, and their responses were evaluated against the official answers published by the Student Selection and Placement Centre. Results: ChatGPT 4 was the best-performing model, with an overall accuracy of 88.75%. Llama 3 70B followed closely with 79.17% accuracy. Gemini 1.5 Pro achieved 78.13% accuracy, while Command R+ lagged behind at 50%. ChatGPT 4 demonstrated strengths in both basic and clinical medical science questions. Performance varied across question difficulties, with ChatGPT 4 maintaining high accuracy even on the most challenging questions. Conclusions: GPT-4 achieved satisfactory results on the Turkish Medical Specialty Training Entrance Exam, demonstrating its potential as a safe source of basic and clinical medical sciences knowledge. These models could be valuable resources for medical education support in non-English-speaking areas. However, the other models show promise but need significant improvement to compete with the best-performing models.

Language: English

Citations: 0

'Always Nice and Confident, Sometimes Wrong': Developer's Experiences Engaging Generative AI Chatbots Versus Human-Powered Q&A Platforms
Jiachen Li, Elizabeth D. Mynatt, Varun Mishra et al.

Proceedings of the ACM on Human-Computer Interaction, Journal Year: 2025, Volume and Issue: 9(2), P. 1 - 22

Published: May 2, 2025

Software engineers have historically relied on human-powered Q&A platforms like Stack Overflow (SO) as coding aids. With the rise of generative AI, developers have started to adopt AI chatbots, such as ChatGPT, in their software development process. Recognizing potential parallels between human-powered and AI-powered question-based assistance, we investigate and compare how developers integrate this assistance into their real-world experiences by conducting a thematic analysis of 1700+ Reddit posts. Through a comparative study of SO and ChatGPT, we identified each platform's strengths, use cases, and barriers. Our findings suggest that ChatGPT offers fast, clear, and comprehensive responses and fosters a more respectful environment than SO. However, concerns about ChatGPT's reliability stem from its overly confident tone and the absence of validation mechanisms like SO's voting system. Based on these findings, we synthesized design implications for future GenAI code assistants and recommend workflows leveraging the unique features of each platform to improve developer experiences.

Language: English

Citations: 0

The Intersection of Generative AI and Healthcare: Addressing Challenges to Enhance Patient Care
Elham Albaroudi, Taha Mansouri, Ali Alameer et al.

Published: March 3, 2024

Language: English

Citations: 3

Performance of 5 Prominent Large Language Models in Surgical Knowledge Evaluation: A Comparative Analysis
Adam M. Ostrovsky, Joshua Chen, Vishal N. Shah et al.

Mayo Clinic Proceedings Digital Health, Journal Year: 2024, Volume and Issue: 2(3), P. 348 - 350

Published: June 5, 2024

Language: English

Citations: 2

Evaluating the Performance of ChatGPT 3.5 and 4.0 on StatPearls Oculoplastic Surgery Text- and Image-Based Exam Questions

Gurnoor S Gill, Jacob Blair, S M Litinsky et al.

Cureus, Journal Year: 2024, Volume and Issue: unknown

Published: Nov. 16, 2024

The emergence of large language models (LLMs) has led to significant interest in their potential use as medical assistive tools. Prior investigations have analyzed the overall comparative performance of LLM versions within different ophthalmology subspecialties. However, limited work has characterized their performance on image-based questions, a recent advance in LLM capabilities. The purpose of this study was to evaluate Chat Generative Pre-Trained Transformers (ChatGPT) 3.5 and 4.0 on text-only and image-based questions, using oculoplastic subspecialty questions from the StatPearls and OphthoQuestions question banks.

Language: English

Citations: 2

The scientific knowledge of three large language models in cardiology: multiple-choice questions examination-based performance
Ibraheem Altamimi, Abdullah Alhumimidi, Salem Alshehri et al.

Annals of Medicine and Surgery, Journal Year: 2024, Volume and Issue: unknown

Published: May 3, 2024

Background: The integration of artificial intelligence (AI) chatbots like Google's Bard, OpenAI's ChatGPT, and Microsoft's Bing Chatbot into academic and professional domains, including cardiology, has been rapidly evolving. Their application in educational and research frameworks, however, raises questions about their efficacy, particularly in specialized fields like cardiology. This study aims to evaluate the knowledge depth and accuracy of these AI chatbots in cardiology using a multiple-choice question (MCQ) format. Methods: This study was conducted as an exploratory, cross-sectional study in November 2023 on a bank of 100 MCQs covering various cardiology topics that were created from authoritative textbooks and question banks. These were then used to assess the knowledge level of Microsoft Bing, Google Bard, and ChatGPT 4.0. Each question was entered manually into the chatbots, ensuring no memory retention bias. Results: The study found that ChatGPT 4.0 demonstrated the highest score with 87% accuracy, followed by Bing at 60% and Bard at 46%. Performance varied across different subtopics, with ChatGPT consistently outperforming the others. Notably, the study revealed significant differences in proficiency across specific domains. Conclusion: This study highlights the spectrum of efficacy among AI chatbots in disseminating cardiology knowledge. ChatGPT 4.0 emerged as a potential auxiliary educational resource, surpassing traditional learning methods in some aspects. However, the variability among AI systems underscores the need for cautious evaluation and continuous improvement, especially to ensure reliability in medical knowledge dissemination.
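The abstract reports per-chatbot accuracies (87%, 60%, 46% of 100 MCQs) but no pairwise significance tests. A minimal sketch of how such a comparison could be made, assuming a pooled two-proportion z-test and treating the samples as independent (the same 100 questions were actually reused across chatbots, so a paired test such as McNemar's would be stricter):

```python
from math import sqrt, erfc

def two_proportion_z(successes_a, successes_b, n):
    """Pooled two-proportion z-test for two models each graded on n items.
    Assumes independent, equally sized samples (an approximation here)."""
    p_a, p_b = successes_a / n, successes_b / n
    pooled = (successes_a + successes_b) / (2 * n)
    se = sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return z, p_value

# Accuracies from the abstract: ChatGPT 4.0 (87/100) vs. Bing (60/100).
z, p = two_proportion_z(87, 60, 100)
```

Under this approximation the 27-point gap between ChatGPT 4.0 and Bing gives z above 4, well past the 1.96 threshold for p < 0.05.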

Language: English

Citations: 2

Exploring the role of AI in classifying, analyzing, and generating case reports on assisted suicide cases: feasibility and ethical implications
Giovanni Spitale, Gerold Schneider, Federico Germani et al.

Frontiers in Artificial Intelligence, Journal Year: 2023, Volume and Issue: 6

Published: Dec. 14, 2023

This paper presents a study on the use of AI models for the classification of case reports on assisted suicide procedures. The database of the five Dutch regional bioethics committees was scraped to collect the 72 cases available in English. We trained several AI models for classification according to the categories defined by the Termination of Life on Request and Assisted Suicide (Review Procedures) Act. We also conducted a related project to fine-tune an OpenAI GPT-3.5-turbo large language model for generating new fictional but plausible cases. As AI is increasingly being used for judgement, it is possible to imagine the application of AI in decision-making regarding assisted suicide. Here we explore the two arising questions of feasibility and ethics, with the aim of contributing to a critical assessment of the potential role of AI in highly sensitive areas.

Language: English

Citations: 5