Evaluating AI Excellence: A Comparative Analysis of Generative Models in Library and Information Science

Raiyan Bin Reza, Md. Rifat Mahmud, S.M. Zabed Ahmed

et al.

Science & Technology Libraries, Journal Year: 2024, Volume and Issue: unknown, P. 1 - 14

Published: Oct. 7, 2024

This study compares the performance of GPT-3.5, GPT-4, Bard, and Gemini in answering Library and Information Science (LIS) questions. Sixteen questions were used for the assessment, with two independent examiners scoring the initial and successive responses from each AI system. Statistical analyses, including one-way Analysis of Variance (ANOVA), two-sample t-test, and one-sample t-test, were employed to identify differences. The results revealed consistency in the responses generated across iterations for all systems. Significant differences were observed among the models, with Bard consistently underperforming compared to Gemini. The study also uncovered variability in the examiners' scoring and emphasized the need for multiple evaluators in the assessment.
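
As a rough sketch of the analysis this abstract describes (not the authors' code; the scores below are invented for illustration), the comparison could be run with SciPy's one-way ANOVA and one-sample t-test:

```python
# Hypothetical sketch of the statistical comparison described above.
# All scores are invented for illustration; the study's actual data
# are not reproduced in this listing.
from scipy import stats

# Examiner scores (0-10) for 16 LIS questions, one list per model.
scores = {
    "GPT-3.5": [7, 8, 6, 9, 7, 8, 7, 6, 8, 7, 9, 8, 7, 6, 8, 7],
    "GPT-4":   [9, 9, 8, 9, 8, 9, 9, 8, 9, 8, 9, 9, 8, 9, 9, 8],
    "Bard":    [5, 6, 4, 6, 5, 5, 6, 4, 5, 6, 5, 4, 6, 5, 5, 6],
    "Gemini":  [8, 7, 8, 9, 8, 7, 8, 8, 7, 9, 8, 8, 7, 8, 9, 8],
}

# One-way ANOVA: do mean scores differ across the four models?
f_stat, p_value = stats.f_oneway(*scores.values())
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# One-sample t-test: does each model's mean differ from a nominal
# benchmark score of 6?
for model, vals in scores.items():
    t, p = stats.ttest_1samp(vals, popmean=6.0)
    print(f"{model}: t = {t:.2f}, p = {p:.4f}")
```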

Language: English

A multinational study on the factors influencing university students’ attitudes and usage of ChatGPT
Maram Abdaljaleel, Muna Barakat, Mariam Alsanafi

et al.

Scientific Reports, Journal Year: 2024, Volume and Issue: 14(1)

Published: Jan. 23, 2024

Artificial intelligence models, like ChatGPT, have the potential to revolutionize higher education when implemented properly. This study aimed to investigate the factors influencing university students’ attitudes and usage of ChatGPT in Arab countries. The survey instrument “TAME-ChatGPT” was administered to 2240 participants from Iraq, Kuwait, Egypt, Lebanon, and Jordan. Of those, 46.8% heard of ChatGPT and 52.6% used it before the study. The results indicated that a positive attitude and usage of ChatGPT were determined by ease of use, positive attitude towards technology, social influence, perceived usefulness, behavioral/cognitive influences, low perceived risks, and low anxiety. Confirmatory factor analysis indicated the adequacy of the constructs. Multivariate analysis demonstrated that attitude and usage were significantly influenced by country of residence, age, university type, and recent academic performance. TAME-ChatGPT was validated as a useful tool for assessing ChatGPT adoption among university students. The successful integration of ChatGPT in higher education relies on these elements, including low anxiety and minimal perceived risks. Policies should be tailored to individual contexts, considering the variations in student attitudes observed in this study.
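
As a loose illustration of the multivariate analysis mentioned above (the variables, coding, and data are assumptions, not the paper's TAME-ChatGPT items), one could regress an attitude score on demographic factors with statsmodels:

```python
# Hypothetical sketch: regressing a ChatGPT-attitude score on demographic
# predictors. Variable names and values are invented; the actual
# TAME-ChatGPT constructs are defined in the cited paper.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "attitude":    [4.2, 3.8, 4.5, 2.9, 3.6, 4.1, 3.3, 4.4, 3.9, 3.1, 4.0, 3.7],
    "age":         [19, 22, 20, 24, 21, 19, 23, 20, 22, 25, 21, 20],
    "country":     ["Jordan", "Iraq", "Egypt"] * 4,
    "performance": ["high", "mid", "low", "mid"] * 3,
})

# C() dummy-codes the categorical predictors; ordinary least squares
# stands in for whichever multivariate model the paper used.
fit = smf.ols("attitude ~ age + C(country) + C(performance)", data=df).fit()
print(fit.summary())
```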

Language: English

Citations: 99

Prompt Engineering Paradigms for Medical Applications: Scoping Review
Jamil Zaghir, Marco Naguib, Mina Bjelogrlic

et al.

Journal of Medical Internet Research, Journal Year: 2024, Volume and Issue: 26, P. e60501 - e60501

Published: Sept. 10, 2024

Prompt engineering, focusing on crafting effective prompts for large language models (LLMs), has garnered attention for its capabilities at harnessing the potential of LLMs. This is even more crucial in the medical domain due to its specialized terminology and technicity. Clinical natural language processing applications must navigate complex language and ensure privacy compliance. Prompt engineering offers a novel approach by designing tailored prompts to guide models in exploiting clinically relevant information from complex medical texts. Despite its promise, the efficacy of prompt engineering in the medical domain remains to be fully explored.
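
To make "designing tailored prompts" concrete, here is a minimal sketch of a prompt template for pulling clinically relevant fields out of a note; the wording, fields, and example note are illustrative assumptions, not taken from the review:

```python
# Minimal prompt-design sketch for clinical information extraction.
# Template, field names, and the example note are illustrative only.
PROMPT_TEMPLATE = """You are a clinical NLP assistant.
Extract the following fields from the clinical note below.
Answer with one line per field; write "not stated" if a field is absent.

Fields: diagnosis, current_medications, allergies

Clinical note:
{note}
"""

note = (
    "62-year-old male with type 2 diabetes, on metformin 500 mg "
    "twice daily. No known drug allergies."
)

prompt = PROMPT_TEMPLATE.format(note=note)
print(prompt)  # This string would then be sent to an LLM of choice.
```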

Language: English

Citations: 10

Knowledge graph validation by integrating LLMs and human-in-the-loop
Stefani Tsaneva, Danilo Dessì, Francesco Osborne

et al.

Information Processing & Management, Journal Year: 2025, Volume and Issue: 62(5), P. 104145 - 104145

Published: April 9, 2025

Language: English

Citations: 0

Prompt Engineering Paradigms for Medical Applications: Scoping Review (Preprint)
Jamil Zaghir, Marco Naguib, Mina Bjelogrlic

et al.

Published: May 14, 2024

BACKGROUND: Prompt engineering, focusing on crafting effective prompts for large language models (LLMs), has garnered attention for its capabilities at harnessing the potential of LLMs. This is even more crucial in the medical domain due to its specialized terminology and technicity. Clinical natural language processing applications must navigate complex language and ensure privacy compliance. Prompt engineering offers a novel approach by designing tailored prompts to guide models in exploiting clinically relevant information from complex medical texts. Despite its promise, the efficacy of prompt engineering in the medical domain remains to be fully explored.

OBJECTIVE: The aim of the study is to review research efforts and technical approaches in prompt engineering for medical applications, as well as to provide an overview of the opportunities and challenges for clinical practice.

METHODS: Databases indexing the fields of medicine, computer science, and medical informatics were queried in order to identify relevant published papers. Since prompt engineering is an emerging field, preprint databases were also considered. Multiple data were extracted, such as the prompt paradigm, the LLMs involved, the languages of the study, the domain topic, the baselines, and several learning, design, and architecture strategies specific to prompt engineering. We included studies that apply prompt engineering–based methods to the medical domain, published between 2022 and 2024, covering multiple prompt paradigms: prompt learning (PL), prompt tuning (PT), and prompt design (PD).

RESULTS: We included 114 recent studies. Among the 3 prompt paradigms, we observed that PD is the most prevalent (78 papers). In 12 papers, the PD, PL, and PT terms were used interchangeably. While ChatGPT is the most commonly used LLM, we identified 7 studies using this LLM on a sensitive clinical data set. Chain-of-thought, present in 17 studies, emerges as the most frequent prompt design technique. While PL and PT papers typically include a baseline for evaluating prompt-based approaches, 61% (48/78) of the PD papers do not report any nonprompt-related baseline. Finally, we individually examine each of the key prompt engineering–specific items reported across the studies and find that many papers neglect to explicitly mention them, posing a challenge for advancing research.

CONCLUSIONS: In addition to reporting trends in the scientific landscape, we provide guidelines for future research to help advance the field. We also disclose tables and figures summarizing the available studies, and we hope that future contributions will leverage these existing works to better advance the field.
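
Since chain-of-thought is reported here as the most frequent prompt design technique, a minimal sketch of such a prompt may help; the worked example and question below are invented for illustration:

```python
# Minimal chain-of-thought prompt sketch (illustrative; not from the review).
# A worked example with explicit reasoning steps precedes the real question,
# nudging the model to reason step by step before answering.
COT_PROMPT = """Q: A patient takes 250 mg of a drug every 8 hours.
How many mg per day?
A: Let's think step by step. There are 24 / 8 = 3 doses per day.
Each dose is 250 mg, so 3 * 250 = 750 mg per day. The answer is 750 mg.

Q: A patient takes 100 mg of a drug every 6 hours.
How many mg per day?
A: Let's think step by step."""

print(COT_PROMPT)  # Sent to an LLM, which completes the reasoning chain.
```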

Language: English

Citations: 2
