Accuracy and reliability of large language models in assessing learning outcomes achievement across cognitive domains (Open Access)

Swapna Haresh Teckwani, Amanda Huee‐Ping Wong, W. A. N. V. Luke, et al.

AJP Advances in Physiology Education, Year: 2024, Vol. 48(4), pp. 904-914

Published: Nov. 8, 2024

The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to evaluate the accuracy and reliability of these LLMs in evaluating the achievement of learning outcomes across different cognitive domains in a scientific inquiry course on sports physiology. Human graders and three LLMs, GPT-3.5, GPT-4o, and Gemini, were tasked with scoring submitted student assignments according to a set of rubrics aligned with various cognitive domains, namely "Understand," "Analyze," and "Evaluate" from the revised Bloom's taxonomy, and "Scientific Inquiry Competency." Our findings revealed that while the LLMs demonstrated some level of competency, they do not yet meet the assessment standards of human graders. Specifically, interrater reliability (percentage agreement and correlation analysis) between human graders was superior compared with that between the two grading rounds of each LLM, respectively. Furthermore, concordance between human and LLM graders was mostly moderate to poor in terms of overall scores and across the pre-specified cognitive domains. The results suggest a future where AI could complement human expertise in educational assessment but underscore the importance of adaptive learning by educators and continuous improvement of current AI technologies to fully realize this potential.

Language: English

Google Gemini as a next generation AI educational tool: a review of emerging educational technology (Creative Commons)
Muhammad Imran, Norah Almusharraf

Smart Learning Environments, Year: 2024, Vol. 11(1)

Published: May 23, 2024

This emerging technology report discusses Google Gemini as a multimodal generative AI tool and presents its revolutionary potential for future educational technology. It introduces Gemini and its features, including its versatility in processing data from text, image, audio, and video inputs and in generating diverse content types. The study discusses recent empirical studies, technology in practice, and the relationship between Gemini technology and the educational landscape. It further explores Gemini’s relevance for future educational endeavors and its practical applications in emerging technologies. Also, it discusses the significant challenges and ethical considerations that must be addressed to ensure its responsible and effective integration into the educational landscape.

Language: English

Cited by: 25

Exploring the Prospects and Perils of Integrating Artificial Intelligence and ChatGPT in Academic and Research Libraries (ARL): Challenges and Opportunity
Satveer Singh Nehra, Sadanand Y. Bansode

Journal of Web Librarianship, Year: 2024, Vol. unknown, pp. 1-22

Published: Aug. 13, 2024

An attempt is made to find out the scope of AI and the AI-powered chatbot ChatGPT in academic and research libraries, its possible challenges and opportunities, and how it makes a difference. In this study, we found that AI chatbots have the potential to revolutionize libraries by promoting specialized librarianship and reshaping library services through thinking outside the box, enabling the search and retrieval of information based on personal recommendations. These advancements might raise the standards of the services that libraries offer. However, the challenges include user dependence on AI, which may affect users' reading habits, the lack of skilled staff in underdeveloped nations, and the need for high-quality data. ChatGPT also faces the challenges of biases, ethical implications, and an inability to understand tone or context, leading to misunderstandings and poor communication. To effectively utilize ChatGPT as a library customer service tool, libraries must manage their data, monitor ChatGPT's responses, and consider its limitations.

Language: English

Cited by: 8
