
Natural language processing in the era of large language models
Arkaitz Zubiaga

Frontiers in Artificial Intelligence, Journal Year: 2024, Volume and Issue: 6

Published: Jan. 12, 2024

Specialty Grand Challenge article. Front. Artif. Intell., Sec. Natural Language Processing, Volume 6, 2023. https://doi.org/10.3389/frai.2023.1350306

Language: English

Citations: 11

Large Language Models in Oncology: Revolution or Cause for Concern?

Aydin Caglayan, Wojciech Slusarczyk, Rukhshana Dina Rabbani

et al.

Current Oncology, Journal Year: 2024, Volume and Issue: 31(4), P. 1817 - 1830

Published: March 29, 2024

The technological capability of artificial intelligence (AI) continues to advance with great strength. Recently, the release of large language models has taken the world by storm, with concurrent excitement and concern. As a consequence of their impressive ability and versatility, large language models provide a potential opportunity for implementation in oncology. Areas of possible application include supporting clinical decision making, education, and contributing to cancer research. Despite the promises that these novel systems can offer, several limitations and barriers challenge their implementation. It is imperative that concerns, such as accountability, data inaccuracy, and data protection, are addressed prior to their integration. As progression continues, new ethical and practical dilemmas will also be approached; thus, the evaluation of these concerns is dynamic in nature. This review offers a comprehensive overview of the potential application of large language models in oncology, as well as the concerns surrounding their implementation in cancer care.

Language: English

Citations: 11

A tutorial on open-source large language models for behavioral science
Z. Hussain, Marcel Binz, Rui Mata

et al.

Behavior Research Methods, Journal Year: 2024, Volume and Issue: 56(8), P. 8214 - 8237

Published: Aug. 15, 2024

Large language models (LLMs) have the potential to revolutionize behavioral science by accelerating and improving the research cycle, from conceptualization to data analysis. Unlike closed-source solutions, open-source frameworks for LLMs can enable transparency, reproducibility, and adherence to data protection standards, which gives them a crucial advantage for use in behavioral science. To help researchers harness the promise of LLMs, this tutorial offers a primer on the Hugging Face ecosystem and demonstrates several applications that advance conceptual and empirical work in behavioral science, including feature extraction, fine-tuning of models for prediction, and generation of behavioral responses. Executable code is made available at github.com/Zak-Hussain/LLM4BeSci.git. Finally, the tutorial discusses challenges faced with (open-source) LLMs related to interpretability and safety and offers a perspective on the future of the intersection of language modeling and behavioral science.
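As a flavor of the workflow the tutorial covers, the sketch below shows feature extraction with the Hugging Face transformers library. The model choice (distilbert-base-uncased), the example texts, and the mean-pooling step are illustrative assumptions, not taken from the tutorial itself; its repository contains the actual code.

```python
# Minimal sketch: extracting text embeddings with Hugging Face transformers.
# Model name and pooling strategy are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "distilbert-base-uncased"  # assumed example model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

texts = ["I feel great today.", "This was a stressful week."]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, tokens, dim)

# Mean-pool over non-padding tokens to get one vector per text.
mask = inputs["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 768])
```

Mean pooling over non-padding tokens is one common way to collapse token-level hidden states into a single per-text feature vector usable in downstream behavioral analyses; the tutorial's own examples may differ.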

Language: English

Citations: 11

The performance of large language models on quantitative and verbal ability tests: Initial evidence and implications for unproctored high‐stakes testing
Louis Hickman, Patrick D. Dunlop, Jasper Leo Wolf

et al.

International Journal of Selection and Assessment, Journal Year: 2024, Volume and Issue: 32(4), P. 499 - 511

Published: May 17, 2024

Unproctored assessments are widely used in pre‐employment assessment. However, accessible large language models (LLMs) pose challenges for unproctored personnel assessments, given that applicants may use them to artificially inflate their scores beyond their true abilities. This may be particularly concerning for cognitive ability tests, which are widely used and traditionally considered less fakeable by humans than personality tests. Thus, this study compares the performance of LLMs on two common types of cognitive ability tests: quantitative (number series completion) and verbal (using a passage of text to determine whether a statement is true). The tests investigated are used in real‐world, high‐stakes selection. We also examine LLM performance across different test formats (i.e., open‐ended vs. multiple choice). Further, we contrast the performance of two LLMs (Generative Pretrained Transformers, GPT‐3.5 and GPT‐4) across prompt approaches and “temperature” settings (a parameter that determines the amount of randomness in the model's output). We found that the LLMs performed well on the verbal ability test but extremely poorly on the quantitative ability test, even when accounting for test format. GPT‐4 outperformed GPT‐3.5 on both tests. Notably, although temperature settings did affect LLM performance, those effects were mostly minor relative to differences across models. We provide recommendations for securing unproctored testing against LLM influences. Additionally, we call for rigorous research investigating the prevalence of LLM usage in unproctored assessments, as well as how it affects selection test validity.
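For readers unfamiliar with the temperature parameter the study manipulates, here is a minimal sketch of querying one of the models it examines at several temperature settings, using the OpenAI Python client. The prompt is a made-up number-series item and the temperature values are illustrative assumptions, not the study's actual test content or settings.

```python
# Minimal sketch: varying the temperature parameter across LLM queries.
# Prompt and temperature values are illustrative; requires OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()
prompt = "Continue the number series: 2, 4, 8, 16, ...?"  # hypothetical item

for temperature in (0.0, 0.7, 1.4):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,  # higher values produce more random output
    )
    print(temperature, response.choices[0].message.content)
```

At temperature 0 the model's output is close to deterministic, while higher values sample more diffusely from the output distribution, which is why the study treats temperature as a potential moderator of test performance.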

Language: English

Citations: 9

Studying and improving reasoning in humans and machines
Nicolas Yax, Hernán Anlló, Stefano Palminteri

et al.

Communications Psychology, Journal Year: 2024, Volume and Issue: 2(1)

Published: June 3, 2024

In the present study, we investigate and compare reasoning in large language models (LLMs) and humans, using a selection of cognitive psychology tools traditionally dedicated to the study of (bounded) rationality. We presented to human participants and an array of pretrained LLMs new variants of classical experiments, and cross-compared their performances. Our results showed that most of the included models presented reasoning errors akin to those frequently ascribed to error-prone, heuristic-based human reasoning. Notwithstanding this superficial similarity, an in-depth comparison between humans and LLMs indicated important differences with human-like reasoning, with models' limitations disappearing almost entirely in more recent LLMs' releases. Moreover, we show that while it is possible to devise strategies to induce better performance, humans and machines are not equally responsive to the same prompting schemes. We conclude by discussing the epistemological implications and challenges of comparing human and machine behavior for both artificial intelligence and psychology.

Language: English

Citations: 9