‘A rather stupid but always available brainstorming partner’: Use and understanding of Generative AI by UK postgraduate researchers
Ross M. English, Rebecca Nash, Heather MacKenzie et al.

Innovations in Education and Teaching International, Journal Year: 2025, Volume and Issue: unknown, P. 1 - 15

Published: Jan. 2, 2025

Research into the increased use of Generative AI in Higher Education has largely focused on undergraduate study. While many institutions are grappling with the implications for doctoral-level study, there has been little published work investigating how postgraduate researchers use the technology, or their attitudes towards it. This paper is based on a survey of 75 candidates across 19 UK institutions. The results show that most respondents had used Generative AI in their research, with the most common uses being framed as time-saver, editor, and colleague. There was an awareness of limitations and ethical issues connected to the technology, but no agreement on where those boundaries lie. The paper concludes that there is an urgent need for sector-wide communication on acceptable best practice.

Language: English

The now and future of ChatGPT and GPT in psychiatry
Szu‐Wei Cheng, Chung‐Wen Chang, Wan‐Jung Chang et al.

Psychiatry and Clinical Neurosciences, Journal Year: 2023, Volume and Issue: 77(11), P. 592 - 596

Published: Aug. 24, 2023

ChatGPT has sparked extensive discussions within the healthcare community since its November 2022 release. However, its potential applications in the field of psychiatry have received limited attention. Deep learning has proven beneficial to psychiatry, and GPT is a powerful deep learning-based language model with immense potential for this field. Despite the convenience of ChatGPT, the advanced chatbot currently has practical uses in psychiatry. It may be used to support psychiatrists in routine tasks such as completing medical records, facilitating communications between clinicians and patients, polishing academic writings and presentations, and programming for performing analyses in research. Its current training and application require using appropriate prompts to maximize outputs and minimize deleterious inaccuracies and phantom errors. Moreover, future advances that incorporate empathy, emotion recognition, personality assessment, and detection of mental health warning signs are essential for effective integration into psychiatric care. In the near future, developing a fully-automated psychotherapy system trained on expert communication (such as verbatim transcripts) is conceivable by building on this foundational technology. This dream system should integrate ‘real world’ inputs and friendly AI user and patient interfaces via clinically validated algorithms, voice comprehension/generation modules, and discrimination algorithms based on facial expressions and physiological signals from wearable devices. In addition to the technology challenges, we believe it is critical to establish generally accepted ethical standards for applying ChatGPT-related tools in all environments, including telemedicine and academic/training settings.
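
The abstract's caution about prompts and "phantom errors" lends itself to a small illustration. Below is a minimal, hypothetical sketch of a prompt for one such routine task (drafting a note summary for clinician review), assuming the OpenAI Python client; the model name, prompt wording, and task framing are illustrative assumptions, not the authors' method:

```python
# Hypothetical sketch: prompting a GPT model to draft a clinical-note
# summary, one of the routine support tasks the abstract describes.
# The model name and prompt are assumptions; outputs would always need
# clinician review before use.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are an assistant that drafts concise summaries of clinical notes. "
    "Use only the facts provided; if information is missing, write "
    "'not documented' instead of guessing."  # guards against phantom errors
)

def draft_summary(raw_notes: str) -> str:
    """Return a draft summary of free-text session notes for clinician review."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name, not from the paper
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Summarise these session notes:\n\n{raw_notes}"},
        ],
        temperature=0,  # deterministic output suits documentation tasks
    )
    return response.choices[0].message.content
```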

Language: English

Citations: 106

Comparison of ChatGPT–3.5, ChatGPT-4, and Orthopaedic Resident Performance on Orthopaedic Assessment Examinations
Patrick A. Massey, Carver Montgomery, Andrew S. Zhang et al.

Journal of the American Academy of Orthopaedic Surgeons, Journal Year: 2023, Volume and Issue: unknown

Published: Sept. 4, 2023

Introduction: Artificial intelligence (AI) programs have the ability to answer complex queries, including medical profession examination questions. The purpose of this study was to compare the performance of orthopaedic residents (ortho residents) against Chat Generative Pretrained Transformer (ChatGPT)-3.5 and GPT-4 on orthopaedic assessment examinations. A secondary objective was to perform a subgroup analysis comparing each group's performance on questions that included image interpretation versus text-only questions. Methods: The ResStudy question bank was used as the primary question source. One hundred eighty questions with answer choices from nine different subspecialties were directly input into ChatGPT-3.5 and then GPT-4. ChatGPT did not consistently have image interpretation available, so no images were provided to either AI format. Answers were recorded as correct or incorrect by the chatbot, and resident performance was recorded based on user data from ResStudy. Results: Overall, ChatGPT-3.5, GPT-4, and ortho residents scored 29.4%, 47.2%, and 74.2%, respectively. There was a difference among the three groups in testing success, with ortho residents scoring higher than ChatGPT-3.5 and GPT-4 (P < 0.001 and P < 0.001) and GPT-4 scoring higher than ChatGPT-3.5 (P = 0.002). A subgroup analysis was performed by dividing questions into stems with and without images. ChatGPT-3.5 was more often correct on text-only questions than on image questions (37.8% vs. 22.4%, respectively, OR 2.1, P = 0.033), and ChatGPT-4 was also more often correct on text-only questions (61.0% vs. 35.7%, OR 2.8, P < 0.001). Residents scored 72.6% on image questions and 75.5% on text-only questions, with no significant difference (P = 0.302). Conclusion: Orthopaedic residents were able to answer questions more accurately than ChatGPT-3.5 and GPT-4, and GPT-4 is superior to ChatGPT-3.5 for answering orthopaedic resident assessment examination questions. It is unlikely that ChatGPT-3.5 or GPT-4 would pass the American Board of Orthopaedic Surgery written examination.

Language: English

Citations: 103

GPT-4 passes the bar exam
Daniel Katz, Michael James Bommarito, Shang Gao et al.

Philosophical Transactions of the Royal Society A Mathematical Physical and Engineering Sciences, Journal Year: 2024, Volume and Issue: 382(2270)

Published: Feb. 26, 2024

In this paper, we experimentally evaluate the zero-shot performance of GPT-4 against prior generations of GPT on the entire uniform bar examination (UBE), including not only the multiple-choice multistate bar examination (MBE), but also the open-ended multistate essay exam (MEE) and multistate performance test (MPT) components. On the MBE, GPT-4 significantly outperforms both human test-takers and prior models, demonstrating a 26% increase over ChatGPT and beating humans in five of seven subject areas. On the MEE and MPT, which have not previously been evaluated by scholars, GPT-4 scores an average of 4.2/6.0 when compared with much lower scores for ChatGPT. Graded across all components of the UBE, in the manner a human test-taker would be, GPT-4 scores approximately 297 points, in excess of the passing threshold for all UBE jurisdictions. These findings document not just the rapid and remarkable advance of large language models generally, but also their potential to support the delivery of legal services in society. This article is part of the theme issue 'A complexity science approach to law and governance'.

Language: English

Citations: 78

Evaluating ChatGPT Performance on the Orthopaedic In-Training Examination
Justin Kung, Christopher Marshall, Chase Gauthier et al.

JBJS Open Access, Journal Year: 2023, Volume and Issue: 8(3)

Published: July 1, 2023

Artificial intelligence (AI) holds potential for improving medical education and healthcare delivery. ChatGPT is a state-of-the-art natural language processing AI model which has shown impressive capabilities, scoring in the top percentiles on numerous standardized examinations, including the Uniform Bar Exam and the Scholastic Aptitude Test. The goal of this study was to evaluate the performance of ChatGPT on the Orthopaedic In-Training Examination (OITE), an assessment of knowledge for orthopaedic residents.

Language: English

Citations: 75

Using ChatGPT for human–computer interaction research: a primer
Wilbert Tabone, Joost de Winter

Royal Society Open Science, Journal Year: 2023, Volume and Issue: 10(9)

Published: Sept. 1, 2023

ChatGPT could serve as a tool for text analysis within the field of Human-Computer Interaction, though its validity requires investigation. This study applied ChatGPT to: (1) textbox questionnaire responses on nine augmented-reality interfaces, (2) interview data from participants who experienced these interfaces in a virtual simulator, and (3) transcribed think-aloud data from participants who viewed a real painting and its replica. Using a hierarchical approach, ChatGPT produced scores or summaries of batches of text, which were then aggregated. Results showed that the generated sentiment scores correlated extremely strongly (…)
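
A minimal sketch of this batch-then-aggregate workflow, assuming the OpenAI Python client; the model name, the 0-10 scale, and the JSON reply format are illustrative assumptions rather than the authors' exact protocol:

```python
# Hierarchical scoring sketch: rate free-text responses in batches with a
# GPT model, then aggregate the per-batch scores into one overall score.
import json
from statistics import mean

from openai import OpenAI

client = OpenAI()

def score_batch(responses: list[str]) -> list[float]:
    """Ask the model to rate one batch of participant comments on a 0-10 scale."""
    numbered = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(responses))
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{
            "role": "user",
            "content": (
                "Rate how positive each numbered interface comment is on a "
                "0-10 scale. Reply with a JSON array of numbers only.\n" + numbered
            ),
        }],
        temperature=0,
    )
    return [float(x) for x in json.loads(completion.choices[0].message.content)]

def hierarchical_score(all_responses: list[str], batch_size: int = 20) -> float:
    """Score responses batch by batch, then aggregate into a single mean score."""
    scores: list[float] = []
    for start in range(0, len(all_responses), batch_size):
        scores.extend(score_batch(all_responses[start:start + batch_size]))
    return mean(scores)
```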

Language: English

Citations: 68

GPT is an effective tool for multilingual psychological text analysis
Steve Rathje, Dan-Mircea Mirea, Ilia Sucholutsky et al.

Proceedings of the National Academy of Sciences, Journal Year: 2024, Volume and Issue: 121(34)

Published: Aug. 12, 2024

The social and behavioral sciences have been increasingly using automated text analysis to measure psychological constructs in text. We explore whether GPT, the large-language model (LLM) underlying the AI chatbot ChatGPT, can be used as a tool for automated text analysis in several languages. Across 15 datasets (n = 47,925 manually annotated tweets and news headlines), we tested whether different versions of GPT (3.5 Turbo, 4, and 4 Turbo) can accurately detect psychological constructs (sentiment, discrete emotions, offensiveness, and moral foundations) across 12 languages. We found that GPT (r = 0.59 to 0.77) performed much better than English-language dictionary analysis (r = 0.20 to 0.30) at detecting these constructs as judged by manual annotators. GPT performed nearly as well as, and sometimes better than, several top-performing fine-tuned machine learning models. Moreover, GPT's performance improved with each successive model, particularly for lesser-spoken languages, and became less expensive. Overall, GPT may be superior to many existing methods of automated text analysis, since it achieves relatively high accuracy, requires no training data, and is easy to use with simple prompts (e.g., "is this text negative?") and little coding experience. We provide sample code and a video tutorial for analyzing text through the application programming interface. We argue that GPT and other LLMs may help democratize automated text analysis by making advanced natural language processing capabilities more accessible, and may facilitate cross-linguistic research on understudied languages.
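
A minimal sketch of that simple-prompt approach, assuming the OpenAI Python client; the model name and exact wording are assumptions, and the paper's own sample code may differ:

```python
# Binary sentiment check in the plain-language prompt style quoted in the
# abstract ("is this text negative?"). The same prompt works unchanged for
# any of the languages in the study.
from openai import OpenAI

client = OpenAI()

def is_negative(text: str, model: str = "gpt-4o-mini") -> bool:
    """Return True if the model judges the text to be negative."""
    completion = client.chat.completions.create(
        model=model,  # assumed default; the study tested GPT-3.5 Turbo, 4, and 4 Turbo
        messages=[{
            "role": "user",
            "content": f'Is this text negative? Answer "yes" or "no" only.\n\n"{text}"',
        }],
        temperature=0,
    )
    return completion.choices[0].message.content.strip().lower().startswith("yes")

# Example (non-English input works the same way):
# is_negative("Este día fue terrible.")  # -> True expected
```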

Language: English

Citations: 51

Comparative Evaluation of LLMs in Clinical Oncology
Nicholas R. Rydzewski, Deepak Dinakaran, Shuang G. Zhao et al.

NEJM AI, Journal Year: 2024, Volume and Issue: 1(5)

Published: April 16, 2024

As artificial intelligence (AI) tools become widely accessible, more patients and medical professionals will turn to them for medical information. Large language models (LLMs), a subset of AI, excel in natural language processing tasks and hold considerable promise for clinical use. Fields such as oncology, in which clinical decisions are highly dependent on a continuous influx of new trial data and evolving guidelines, stand to gain immensely from such advancements. It is therefore of critical importance to benchmark these tools and describe their performance characteristics to guide their safe application to clinical oncology. Accordingly, the primary objectives of this work were to conduct comprehensive evaluations of LLMs in the field of clinical oncology and to identify and characterize strategies that clinicians can use to bolster confidence in a model's response.

Language: English

Citations: 29

The performance of ChatGPT on orthopaedic in-service training exams: A comparative study of the GPT-3.5 turbo and GPT-4 models in orthopaedic education
Michael G. Rizzo, Nathan Cai, David S. Constantinescu et al.

Journal of Orthopaedics, Journal Year: 2023, Volume and Issue: 50, P. 70 - 75

Published: Nov. 23, 2023

Language: English

Citations: 32

The model student: GPT-4 performance on graduate biomedical science exams
Daniel Stribling, Yuxing Xia, Maha K Amer et al.

Scientific Reports, Journal Year: 2024, Volume and Issue: 14(1)

Published: March 7, 2024

The GPT-4 large language model (LLM) and ChatGPT chatbot have emerged as accessible and capable tools for generating English-language text in a variety of formats. GPT-4 has previously performed well when applied to questions from multiple standardized examinations. However, further evaluation of the trustworthiness and accuracy of GPT-4 responses across various knowledge domains is essential before its use as a reference resource. Here, we assess GPT-4 performance on nine graduate-level examinations in the biomedical sciences (seven blinded), finding that GPT-4 scores exceed the student average in seven of nine cases and exceed all student scores on four exams. GPT-4 performed very well on fill-in-the-blank, short-answer, and essay questions, and correctly answered several questions about figures sourced from published manuscripts. Conversely, it performed poorly on questions containing simulated data and on those requiring a hand-drawn answer. Two answer sets were flagged as plagiarism based on answer similarity, and some answers included detailed hallucinations. In addition to assessing GPT-4's performance, we discuss patterns and limitations in its capabilities, with the goal of informing the design of future academic assessments in the era of generative AI.

Language: English

Citations: 15

ChatGPT: Literate or intelligent about UN sustainable development goals?
Raghu Raman, Hiran H. Lathabai, Santanu Mandal et al.

PLoS ONE, Journal Year: 2024, Volume and Issue: 19(4), P. e0297521 - e0297521

Published: April 24, 2024

Generative AI tools, such as ChatGPT, are progressively transforming numerous sectors, demonstrating a capacity to impact human life dramatically. This research seeks to evaluate the UN Sustainable Development Goals (SDGs) literacy of ChatGPT, which is crucial for the diverse stakeholders involved in SDG-related policies. Experimental outcomes from two widely used sustainability assessment tests, the SDG Fitness Test and the Sustainability Literacy Test (SULITEST), suggest that ChatGPT exhibits high SDG literacy, yet its comprehensive SDG intelligence needs further exploration. The Fitness Test gauges eight vital competencies across introductory, intermediate, and advanced levels. Accurate mapping of these test questions is essential for even a partial evaluation of SDG intelligence. To assess intelligence, both tests were mapped to the 17 SDGs and to cross-cutting core competencies, but the questionnaires were found to be insufficient: SULITEST could satisfactorily map only 5 out of 8 competencies, whereas the Fitness Test managed 6 out of 8. Regarding coverage of the SDGs, both tests fell short. Most SDGs were underrepresented in both instruments, with certain SDGs not represented at all. Consequently, both tools proved ineffective for assessing SDG intelligence through SDG coverage. The study recommends that future versions of these tests enhance coverage of collaboration, critical thinking, systems thinking, and other competencies needed to achieve the SDGs. It concludes that while models like ChatGPT hold considerable potential for sustainable development, their usage must be approached carefully, considering current limitations and ethical implications.

Language: English

Citations: 14