Published: July 27, 2024
Language: English
IEEE Transactions on Learning Technologies, Journal year: 2024, Issue: 17, pp. 1920 - 1930
Published: Jan. 1, 2024
Manually scoring and revising student essays has long been a time-consuming task for educators. With the rise of natural language processing techniques, automated essay scoring (AES) and automated essay revision (AER) have emerged to alleviate this burden. However, current AES and AER models require large amounts of training data and lack generalizability, which makes them hard to implement in daily teaching activities. Moreover, online sites offering such services charge high fees and raise security issues when uploading student content. In light of these challenges, and recognizing the advancements in large language models (LLMs), we aim to fill these research gaps by analyzing the performance of open-source LLMs when accomplishing AES and AER tasks. Using a human-scored essay dataset (n = 600) collected from an assessment, we implemented zero-shot, few-shot, and p-tuning methods based on open-source LLMs and conducted a human-machine consistency check. We also conducted a similarity test and a score difference analysis on the revision results with statistical support. The consistency check result shows that an LLM of 10B parameter size is close to some deep learning baseline models, and it can be further improved by integrating comment information into few-shot or continuous prompts. The results show that LLMs can effectively accomplish the AER task, improving essay quality while ensuring the revision results are similar to the original essays. This study reveals a practical path for cost-effectively, time-efficiently, and content-safely assisting teachers using LLMs.
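The few-shot prompting setup described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the 0-10 score scale, and the helper names (`build_few_shot_prompt`, `parse_score`) are assumptions.

```python
import re

def build_few_shot_prompt(essay, examples, scale=(0, 10)):
    """Assemble a few-shot scoring prompt: each in-context example pairs a
    scored essay (optionally with a teacher comment, as the abstract suggests
    integrating) with its score; the new essay is appended last."""
    lines = [f"Score the essay on a {scale[0]}-{scale[1]} scale."]
    for ex in examples:
        lines.append(f"Essay: {ex['essay']}")
        if ex.get("comment"):  # hypothetical comment-integration variant
            lines.append(f"Comment: {ex['comment']}")
        lines.append(f"Score: {ex['score']}")
    lines.append(f"Essay: {essay}")
    lines.append("Score:")
    return "\n".join(lines)

def parse_score(model_output):
    """Pull the first integer out of the model's reply, or None if absent."""
    match = re.search(r"\d+", model_output)
    return int(match.group()) if match else None
```

A human-machine consistency check would then compare `parse_score(...)` outputs against the human scores for the 600 essays.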
Language: English
Cited: 6
The Asia-Pacific Education Researcher, Journal year: 2024, Issue: 33(4), pp. 957 - 976
Published: May 29, 2024
Language: English
Cited: 3
Systems, Journal year: 2024, Issue: 12(9), pp. 380 - 380
Published: Sep. 21, 2024
Constructed response items, which require the student to give more detailed and elaborate responses, are widely applied in large-scale assessments. However, hand-crafted scoring with a rubric for massive numbers of responses is labor-intensive and impractical due to rater subjectivity and answer variability. Automatic coding methods, such as automated scoring of short answers, have become a critical component of the learning assessment system. In this paper, we propose an interactive system called ASSIST to efficiently score responses with expert knowledge and then generate a classifier. First, ungraded responses are clustered into specific codes, with representative responses and indicator words. A constraint set based on feedback from experts is taken as training data for metric learning to compensate for machine bias. Meanwhile, a classifier for each code is trained according to the clustering results. Second, experts review each coded cluster and its indicator words for rating. The pairs will be validated to ensure inter-rater reliability. Finally, the classifier is available for new responses with out-of-distribution detection, which measures the similarity between the response representation and the class proxy, i.e., the weight of the last linear layer. The originality of the developed system stems from this procedure, which involves adaptive interaction and can identify unseen classes. The proposed system is evaluated on our real-world dataset. The results of the experiments demonstrate its effectiveness in saving human effort and improving scoring performance. The average improvements in coding quality and accuracy are 14.48% and 18.94%, respectively. Additionally, we report inter-rater reliability, out-of-distribution rate, and response statistics, before and after interaction.
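The out-of-distribution check described in this abstract, comparing a response representation against each class proxy taken from the last linear layer's weights, can be sketched as follows. The cosine-similarity formulation and the threshold value are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def ood_detect(representation, class_weights, threshold=0.5):
    """Return (predicted_class, is_ood). Each row of class_weights is a
    class proxy (a row of the classifier's last linear layer). A response
    whose best cosine similarity to any proxy falls below the threshold
    is flagged as out-of-distribution."""
    rep = representation / np.linalg.norm(representation)
    proxies = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    sims = proxies @ rep                      # cosine similarity per class
    best = int(np.argmax(sims))
    return best, bool(sims[best] < threshold)
```

Responses flagged as OOD would be routed back to the expert for review, which is how a system of this kind could surface previously unseen answer classes.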
Language: English
Cited: 0
Published: July 27, 2024
Language: English
Cited: 0