The Origins and Veracity of References ‘Cited’ by Generative Artificial Intelligence Applications: Implications for the Quality of Responses
Dirk Spennemann

Publications, Journal Year: 2025, Volume and Issue: 13(1), P. 12 - 12

Published: March 12, 2025

The public release of ChatGPT in late 2022 has resulted in considerable publicity and led to widespread discussion of the usefulness and capabilities of generative artificial intelligence (AI) language models. Its ability to extract and summarise data from textual sources and present them as human-like contextual responses makes it an eminently suitable tool to answer questions users might ask. Expanding on a previous analysis of ChatGPT3.5, this paper tested what archaeological literature appears to have been included in the training phase of three recent AI models: ChatGPT4o, ScholarGPT, and DeepSeek R1. While ChatGPT3.5 offered seemingly pertinent references, a large percentage proved to be fictitious. The more specialised model ScholarGPT, which is purportedly tailored towards academic needs, performed much better, but it still produced a high rate of fictitious references compared to the general models ChatGPT4o and DeepSeek. Using ‘cloze’ analysis to make inferences about the text ‘memorized’ by each model, the study was unable to prove that any of the four genAI models had perused the full texts of the genuine references. It can be shown that all references provided by ScholarGPT and the other OpenAI models, as well as DeepSeek, that were found to be genuine are also cited on Wikipedia pages. This strongly indicates that the source base for at least some, if not most, of these references consists of those pages and thus represents, at best, third-hand material. This has significant implications for the quality of the sources available to generative AI applications to shape their answers. These implications are discussed.
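
The ‘cloze’ approach mentioned in the abstract can be illustrated with a minimal sketch: part of a bibliographic reference is hidden and the model is asked to complete it, with a verbatim completion taken as a hint that the text was memorized during training. The snippet below is a hypothetical illustration only, not the paper’s actual protocol; the prompt wording, the `gpt-4o` model name, the `cloze_probe` helper, and the exact-match criterion are all assumptions for the sake of the example. It uses the OpenAI Python client.

```python
# Hypothetical cloze-style memorization probe (illustrative sketch only;
# not the protocol used in the paper). Requires the `openai` package and
# an API key in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()


def cloze_probe(reference: str, hide_last_n_words: int = 5,
                model: str = "gpt-4o") -> tuple[str, bool]:
    """Mask the tail of a bibliographic reference and ask the model to fill it in.

    Returns the model's completion and whether it matches the hidden span
    verbatim (case-insensitive), which would hint at memorization.
    """
    words = reference.split()
    visible = " ".join(words[:-hide_last_n_words])
    hidden = " ".join(words[-hide_last_n_words:])

    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": (
                "Complete the following bibliographic reference exactly, "
                "returning only the missing words:\n" + visible
            ),
        }],
        temperature=0,
    )
    completion = response.choices[0].message.content.strip()
    return completion, completion.lower() == hidden.lower()


# Example usage with a made-up reference string:
# text, memorized = cloze_probe(
#     "Smith, J. (2020). Archaeology of the Pacific Rim. Journal of X, 12(3), 45-67."
# )
```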

Language: English


Citations: 0