ChatGPT's quality: Reliability and validity of concept inventory items (Creative Commons)
Stefan Küchemann, Martina A. Rau, Albrecht Schmidt, et al.

Frontiers in Psychology, Journal Year: 2024, Volume and Issue: 15

Published: Oct. 8, 2024

Introduction: The recent advances of large language models (LLMs) have opened a wide range of opportunities, but at the same time, they pose numerous challenges and questions that research needs to answer. One of the main challenges is the quality and correctness of the output of LLMs, as well as the overreliance of students on that output without critically reflecting on it. This poses the question of the quality of LLM output in educational tasks and of what teachers need to consider when using LLMs for creating items. In this work, we focus on the quality and characteristics of conceptual items developed using ChatGPT without user-generated improvements. Methods: For this purpose, we optimized prompts and created 30 conceptual items in kinematics, which is a standard topic in high-school level physics. The items were rated by two independent experts. The 15 items that received the highest rating were included in a conceptual survey. The dimensions were designed to align with the ones in the most commonly used concept inventory, the Force Concept Inventory (FCI). We administered the items together with the FCI to 172 first-year university students. The results show that the items have a medium difficulty and discriminatory index but overall exhibit slightly lower average values than the FCI. Moreover, a confirmatory factor analysis confirmed a three-factor model that is closely aligned with a previously suggested expert model. Results and discussion: In this way, after careful prompt engineering and thorough analysis and selection of fully automatically generated items by ChatGPT, we were able to create concept items that had only a slightly lower quality than carefully human-generated concept items. The procedures to create and select such a high-quality set of items require large efforts and point towards the cognitive demands on teachers when using LLMs to create items. Moreover, the results demonstrate that human oversight or student interviews are necessary when creating one-dimensional assessments and distractors that are closely aligned with students' difficulties.

Language: English

Large language models—Valuable tools that require a sensitive integration into teaching and learning physics
Stefan Küchemann, Steffen Steinert, Jochen Kuhn, et al.

The Physics Teacher, Journal Year: 2024, Volume and Issue: 62(5), P. 400 - 402

Published: April 30, 2024

Citation: Stefan Küchemann, Steffen Steinert, Jochen Kuhn, Karina Avila, Stefan Ruzika; Large language models—Valuable tools that require a sensitive integration into teaching and learning physics. Phys. Teach. 1 May 2024; 62 (5): 400–402. https://doi.org/10.1119/5.0212374

Language: English

Citations

10

Enhancing students’ critical thinking skills through the use of DALL-E
Stefan Küchemann, Steffen Steinert, Karina E. Avila, et al.

The Physics Teacher, Journal Year: 2025, Volume and Issue: 63(2), P. 138 - 139

Published: Jan. 27, 2025

Language: English

Citations

1

Creating physics concept cartoons using ChatGPT
Atakan Çoban, Jochen Kuhn, Stefan Küchemann, et al.

The Physics Teacher, Journal Year: 2025, Volume and Issue: 63(3), P. 220 - 221

Published: Feb. 28, 2025

Language: English

Citations

1

Foundation Models: From Current Developments, Challenges, and Risks to Future Opportunities
Ali Hussain, Sikandar Ali, Umm E. Farwa, et al.

Published: Feb. 16, 2025

Language: English

Citations

0

Exploring Multimodal Generative AI for Education through Co-design Workshops with Students
Prajish Prasad, Rishabh Balse, Dhwani Balchandani, et al.

Published: April 25, 2025

Language: English

Citations

0

A Systematic Literature Review of Empirical Research on Applying Generative Artificial Intelligence in Education
Xin Zhang, Peng Zhang, Yuan Shen, et al.

Frontiers of Digital Education, Journal Year: 2024, Volume and Issue: 1(3), P. 223 - 245

Published: Sept. 1, 2024

Language: English

Citations

2

Harnessing large language models to develop research-based learning assistants for formative feedback (Creative Commons)
Steffen Steinert, Karina E. Avila, Stefan Ruzika, et al.

Smart Learning Environments, Journal Year: 2024, Volume and Issue: 11(1)

Published: Dec. 19, 2024

Abstract: Effectively supporting students in mastering all facets of self-regulated learning is a central aim of teachers and educational researchers. Prior research could demonstrate that formative feedback is an effective way to support students during self-regulated learning. In this light, we propose the application of Large Language Models (LLMs) to guide students towards problem-solving through formative feedback. We present LEAP, a novel platform that utilizes advanced LLMs, such as GPT-4o. LEAP empowers teachers with the ability to effectively pre-prompt and assign tasks to the LLM, resulting in formative feedback that stimulates students’ cognitive and metacognitive processes, thereby enhancing self-regulated learning. We demonstrate that a systematic prompt design can provide a wide range of types of scaffolds to students. These scaffolds, which are rooted in educational research, include sense-making, elaboration, self-explanation, and partial task-solution scaffolds, as well as metacognitive and motivational scaffolds. Through this approach, we emphasize the critical importance of synchronizing technological advances with empirical research and theoretical frameworks. This alignment potentially ensures the positive and effective application of LLMs in the educational landscape.

Language: English

Citations

2

LMM Spectrometric Determination of an Organic Compound (Creative Commons)
Kevin Kawchak

Published: Aug. 28, 2024

Many machine learning models used in academia and industry that identify organic compounds typically lack the ability to converse over prompts and results, and also require expertise across a number of steps to obtain answers. The purpose of this study was primarily to gain insight into the advantages of current, unmodified, state-of-the-art Large Multimodal Models (LMMs) over several prompts containing multiple spectra of varying difficulty, and to evaluate the impact of training data, reasoning, and speed. These readily available and easy-to-use software for the identification of an organic compound based on spectra and molecular formula were found to be reproducible across three similar LMMs. To the author's best knowledge, this marks the first time that GPT variants were each able to correctly identify quinoline using a variety of different spectroscopic images. The results were obtained in a 2-step process consisting of a) uploading high-resolution spectral images, and b) submitting a text prompt with the images that requested a determination. The main findings were: 1) four LMMs provided rationale and step-by-step interpretations of the 1H-NMR, 13C-NMR, and 3 DEPT-NMR spectra from Prompt A; 2) three of these LMMs, led by the GPT-5 preview model, combined the interpretations into the correct chemical structure; 3) two LMMs achieved the top score of 5/5 for generating sequential explanations reflecting the order [...] along with the most [...] explanations.

Language: English

Citations

1
