Cited by Assessing the Use of ChatGPT among Agri-Food Researchers: A Global Perspective

Testing and Evaluation of Health Care Applications of Large Language Models

Suhana Bedi, Yutong Liu, Lucy Orr-Ewing

et al.

JAMA, Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 15, 2024

Importance: Large language models (LLMs) can assist in various health care activities, but current evaluation approaches may not adequately identify the most useful application areas. Objective: To summarize existing evaluations of LLMs in health care in terms of 5 components: (1) evaluation data type, (2) health care task, (3) natural language processing (NLP) and natural language understanding (NLU) tasks, (4) dimension of evaluation, and (5) medical specialty. Data Sources: A systematic search of PubMed and Web of Science was performed for studies published between January 1, 2022, and February 19, 2024. Study Selection: Studies evaluating 1 or more LLMs in health care. Data Extraction and Synthesis: Three independent reviewers categorized studies via keyword searches based on the data used, the health care tasks, the NLP and NLU tasks, the dimensions of evaluation, and the medical specialty. Results: Of 519 studies reviewed, published between January 1, 2022, and February 19, 2024, only 5% used real patient care data for LLM evaluation. The most common health care tasks were assessing medical knowledge, such as answering medical licensing examination questions (44.5%), and making diagnoses (19.5%). Administrative tasks such as assigning billing codes (0.2%) and writing prescriptions were less studied. For NLP and NLU tasks, most studies focused on question answering (84.2%), while tasks such as summarization (8.9%) and conversational dialogue (3.3%) were infrequent. Almost all studies (95.4%) used accuracy as the primary dimension of evaluation; fairness, bias, and toxicity (15.8%), deployment considerations (4.6%), and calibration and uncertainty (1.2%) were infrequently measured. Finally, in terms of medical specialty area, most studies were in generic health care applications (25.6%), internal medicine (16.4%), surgery (11.4%), and ophthalmology (6.9%), with nuclear medicine (0.6%), physical medicine (0.4%), and medical genetics being the least represented. Conclusions and Relevance: Existing evaluations of LLMs mostly focus on accuracy of question answering for medical examinations, without consideration of real patient care data. Dimensions such as fairness, bias, and toxicity, deployment considerations, and calibration and uncertainty received limited attention. Future evaluations should adopt standardized applications and metrics, use clinical data, and broaden their focus to include a wider range of tasks and specialties.

Language: English

Citations

Integrating Retrieval-Augmented Generation with Large Language Models in Nephrology: Advancing Practical Applications

Jing Miao, Charat Thongprayoon, Supawadee Suppadungsuk

et al.

Medicina, Journal Year: 2024, Volume and Issue: 60(3), P. 445 - 445

Published: March 8, 2024

The integration of large language models (LLMs) into healthcare, particularly in nephrology, represents a significant advancement in applying advanced technology to patient care, medical research, and education. These models have progressed from simple text processors to tools capable of deep language understanding, offering innovative ways to handle health-related data, thus improving the efficiency and effectiveness of medical practice. A significant challenge in medical applications of LLMs is their imperfect accuracy and/or tendency to produce hallucinations (outputs that are factually incorrect or irrelevant). This issue is particularly critical in healthcare, where precision is essential, as inaccuracies can undermine the reliability of these models in crucial decision-making processes. To overcome these challenges, various strategies have been developed. One such strategy is prompt engineering, like the chain-of-thought approach, which directs LLMs towards more accurate responses by breaking down the problem into intermediate steps or reasoning sequences. Another is the retrieval-augmented generation (RAG) strategy, which helps address hallucinations by integrating external data, enhancing output accuracy and relevance. Hence, RAG is favored for tasks requiring up-to-date, comprehensive information, such as clinical decision making or educational applications. In this article, we showcase the creation of a specialized ChatGPT model integrated with a RAG system, tailored to align with the KDIGO 2023 guidelines for chronic kidney disease. This example demonstrates its potential in providing specialized, accurate medical advice, marking a step towards more reliable and efficient nephrology practices.

Language: English

Citations

Bibliometric Analysis on ChatGPT Research with CiteSpace

Dongyan Nan, Xiangying Zhao, Chaomei Chen

et al.

Information, Journal Year: 2025, Volume and Issue: 16(1), P. 38 - 38

Published: Jan. 9, 2025

ChatGPT is a generative artificial intelligence (AI) based chatbot developed by OpenAI and has attracted great attention since its launch in late 2022. This study aims to provide an overview of ChatGPT research through a CiteSpace-based bibliometric analysis. We collected 2465 published articles related to ChatGPT from the Web of Science. The main forces in ChatGPT research were identified by examining productive researchers, institutions, and countries/regions. Moreover, we performed co-authorship network analysis at the levels of author and country/region. Additionally, we conducted a co-citation analysis to identify impactful researchers, journals/sources, and literature in the ChatGPT field, and a cluster analysis to identify the primary themes in this field. The key findings of this study are as follows. First, we found that the most productive researcher, institution, and country in ChatGPT research are Ishith Seth/Himel Mondal, Stanford University, and the United States, respectively. Second, the highly cited researchers in this field are Tiffany H. Kung, Tom Brown, and Malik Sallam. Third, the impactful sources/journals in this area are ARXIV, Nature, and Cureus Journal of Medical Science. Fourth, the most impactful work was published by Kung et al., who demonstrated that ChatGPT can potentially support medical education. Fifth, the overall author-based collaboration network consists of several isolated sub-networks, which indicates that the authors work in small groups and lack communication. Sixth, the United Kingdom, India, and Spain had a high degree of betweenness centrality, which means that they play significant roles in the country/region-based collaboration network. Seventh, the major themes in the ChatGPT area were “data processing using ChatGPT”, “exploring user behavioral intention of ChatGPT”, and “applying ChatGPT for differential diagnosis”. Overall, we believe that our findings will help scholars and stakeholders understand the academic development of ChatGPT.

Language: English

Citations

A Systematic Review of Testing and Evaluation of Healthcare Applications of Large Language Models (LLMs)

Suhana Bedi, Yutong Liu, Lucy Orr-Ewing

et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: April 16, 2024

Abstract. Importance: Large Language Models (LLMs) can assist in a wide range of healthcare-related activities. Current approaches to evaluating LLMs make it difficult to identify the most impactful LLM application areas. Objective: To summarize the current evaluation of LLMs in healthcare in terms of 5 components: evaluation data type, healthcare task, Natural Language Processing (NLP)/Natural Language Understanding (NLU) task, dimension of evaluation, and medical specialty. Data Sources: A systematic search of PubMed and Web of Science was performed for studies published between 01-01-2022 and 02-19-2024. Study Selection: Studies evaluating one or more LLMs in healthcare. Data Extraction and Synthesis: Three independent reviewers categorized 519 studies in terms of the data used in the evaluation, the healthcare tasks (the what) and the NLP/NLU tasks (the how) examined, the dimension(s) of evaluation, and the medical specialty studied. Results: Only 5% of reviewed studies utilized real patient care data for LLM evaluation. The most popular healthcare tasks were assessing medical knowledge (e.g. answering medical licensing exam questions, 44.5%), followed by making diagnoses (19.5%) and educating patients (17.7%). Administrative tasks such as assigning provider billing codes (0.2%), writing prescriptions, generating clinical referrals (0.6%), and clinical notetaking (0.8%) were less studied. For NLP/NLU tasks, the vast majority of studies examined question answering (84.2%). Other tasks such as summarization (8.9%), conversational dialogue (3.3%), and translation (3.1%) were infrequent. Almost all studies (95.4%) used accuracy as the primary dimension of evaluation; fairness, bias and toxicity (15.8%), robustness (14.8%), deployment considerations (4.6%), and calibration and uncertainty (1.2%) were infrequently measured. Finally, in terms of medical specialty area, most studies were in internal medicine (42%), surgery (11.4%), and ophthalmology (6.9%), with nuclear medicine (0.6%), physical medicine (0.4%), and medical genetics (0.2%) being the least represented. Conclusions and Relevance: Existing evaluations of LLMs mostly focused on accuracy of question answering for medical exams, without consideration of real patient care data. Dimensions like fairness, bias and toxicity, robustness, and deployment considerations received limited attention. To draw meaningful conclusions and improve LLM adoption, future studies need to establish a standardized set of LLM applications and evaluation dimensions, perform evaluations using data from routine care, and broaden testing to include administrative tasks as well as multiple medical specialties.
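Key Points. Question: How are healthcare applications of large language models currently evaluated? Findings: Real patient care data were rarely used for evaluation, and administrative tasks were understudied. NLP/NLU tasks such as summarization, conversational dialogue, and translation were infrequently explored. Accuracy was the predominant dimension of evaluation, while fairness, bias, and toxicity assessments were neglected. Evaluations in specialized fields, such as nuclear medicine and medical genetics, were rare. Meaning: Current evaluations remain shallow and fragmented. To draw concrete insights on their performance, evaluations need to use real patient care data across a broad range of tasks and specialties, with standardized dimensions of evaluation.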

Language: English

Citations

Enhancing clinical decision‐making: Optimizing ChatGPT's performance in hypertension care

Jing Miao, Charat Thongprayoon, Tibor Fülöp

et al.

Journal of Clinical Hypertension, Journal Year: 2024, Volume and Issue: 26(5), P. 588 - 593

Published: April 22, 2024

Language: English

Citations

Global Trends in Kidney Stone Awareness: A Time Series Analysis from 2004–2023

Noppawit Aiumtrakul, Charat Thongprayoon, Supawadee Suppadungsuk

et al.

Clinics and Practice, Journal Year: 2024, Volume and Issue: 14(3), P. 915 - 927

Published: May 20, 2024

Despite the prevalence and incidence of kidney stones progressively increasing worldwide, public awareness of this condition remains unclear. Understanding trends in awareness can assist healthcare professionals and policymakers in planning and implementing targeted health interventions. This study investigated online search interest in "kidney stone" by analyzing Google Trends, focusing on the stationarity of the trends and on predicting future trends.

Language: English

Citations

Artificial intelligence chatbots for the nutrition management of diabetes and the metabolic syndrome

Farah Naja, Mandy Taktouk, Dana Matbouli

et al.

European Journal of Clinical Nutrition, Journal Year: 2024, Volume and Issue: 78(10), P. 887 - 896

Published: July 26, 2024

Language: English

Citations

From AI to the Table: A Systematic Review of ChatGPT’s Potential and Performance in Meal Planning and Dietary Recommendations

Peiqi Guo, Guancheng Liu, Xiaoling Xiang

et al.

Dietetics, Journal Year: 2025, Volume and Issue: 4(1), P. 7 - 7

Published: Feb. 14, 2025

A balanced diet is crucial for preventing diseases and managing existing health conditions. ChatGPT has garnered attention from researchers, including nutrition scientists and dietitians, as an innovative tool for personalized meal planning and dietary recommendations. Objectives: The purpose of this study was to review scientific evidence on ChatGPT’s performance in providing personalized meal plans and generating dietary recommendations. Methods: This systematic review was conducted following the PRISMA guidelines. Keyword-based database searches were performed in PubMed, Web of Science, EBSCO, and Embase. Inclusion criteria included (1) empirical studies and (2) primary research. Results: Twenty-three studies met the inclusion criteria, comprising fourteen validation studies, five comparative studies, and four qualitative studies. Most studies reported that ChatGPT achieved satisfactory accuracy and was often indistinguishable from human dietitians. One study even reported that ChatGPT outperformed human dietitians. However, limitations and risks, such as safety concerns and a lack of real-world implementation, were also identified. Conclusions: ChatGPT shows promise in providing relatively reliable meal plans and dietary recommendations, offering more accessible and cost-effective solutions. Nevertheless, further studies are needed to address its challenges.

Language: English

Citations

Can large language models provide accurate and quality information to parents regarding chronic kidney diseases?

Rüya Naz, Okan Akacı, Hakan Erdoğan

et al.

Journal of Evaluation in Clinical Practice, Journal Year: 2024, Volume and Issue: 30(8), P. 1556 - 1564

Published: July 3, 2024

Artificial Intelligence (AI) large language models (LLMs) are tools capable of generating human-like text responses to user queries across topics. The use of these language models in various medical contexts is currently being studied. However, the performance and content quality of these language models have not been evaluated in specific medical fields.

Language: English

Citations

Application Analysis of the Language Model

Bochao Cai

ITM Web of Conferences, Journal Year: 2025, Volume and Issue: 70, P. 04001 - 04001

Published: Jan. 1, 2025

Language models (LMs) like Claude3, ChatGPT, and Llama have developed prominently in recent years. However, with the rapid development of these technologies, how to better utilize them and avoid potential risks has become an important research topic. Therefore, this paper aims to investigate how these LMs can be utilized to serve humans and avoid potential risks. This paper mainly analyzes the use of LMs in four fields, namely finance, healthcare, entertainment, and customer service, to illustrate the usefulness of LMs. LMs are an emerging technology with considerable potential in various fields to help humans do their jobs faster. But at the same time, they may raise problems about privacy, decision affectivity, ethics, and unemployment. This essay will hopefully provide a suggestion for the specification of LMs that come later. These tools need to be used correctly, with reasonable avoidance of the risks they pose, to more fully utilize their strengths.

Language: English

Citations