Communications Psychology, Journal Year: 2024, Volume and Issue: 2(1)
Published: June 3, 2024
In the present study, we investigate and compare reasoning in large language models (LLMs) and humans, using a selection of cognitive psychology tools traditionally dedicated to the study of (bounded) rationality. We presented to human participants and an array of pretrained LLMs new variants of classical cognitive experiments, and cross-compared their performances. Our results showed that most of the included models presented reasoning errors akin to those frequently ascribed to error-prone, heuristic-based human reasoning. Notwithstanding this superficial similarity, an in-depth comparison between humans and LLMs indicated important differences with human-like reasoning, with models' limitations disappearing almost entirely in more recent LLMs' releases. Moreover, we show that while it is possible to devise strategies to induce better performance, humans and machines are not equally responsive to the same prompting schemes. We conclude by discussing the epistemological implications and challenges of comparing human and machine behavior for both artificial intelligence and cognitive psychology.
Perspectives on Psychological Science, Journal Year: 2024, Volume and Issue: 19(5), P. 808 - 826
Published: Jan. 2, 2024
We illustrate how standard psychometric inventories originally designed for assessing noncognitive human traits can be repurposed as diagnostic tools to evaluate analogous traits in large language models (LLMs). We start from the assumption that LLMs, inadvertently yet inevitably, acquire psychological traits (metaphorically speaking) from the vast text corpora on which they are trained. Such corpora contain sediments of the personalities, values, beliefs, and biases of the countless human authors of these texts, which LLMs learn through a complex training process. The traits that LLMs acquire in such a way can potentially influence their behavior, that is, their outputs in downstream tasks and applications in which they are employed, which in turn may have real-world consequences for individuals and social groups. By eliciting LLMs’ responses to language-based psychometric inventories, we can bring their traits to light. Psychometric profiling enables researchers to study and compare LLMs in terms of noncognitive characteristics, thereby providing a window into the personalities, values, beliefs, and biases these models exhibit (or mimic). We discuss the history of similar ideas and outline possible psychometric approaches for LLMs. We demonstrate one promising approach, zero-shot classification, for several LLMs and psychometric inventories. We conclude by highlighting open challenges and future avenues of research for AI Psychometrics.
Academy of Management Review, Journal Year: 2024, Volume and Issue: unknown
Published: Jan. 5, 2024
The growing sophistication of artificial intelligence (AI) tools in entrepreneurship is transforming how new ventures identify, gather, analyze, and utilize information from their internal and external operating environments to automate critical choices, decisions, and tasks. For many startups and corporate ventures, prior research suggests that AI provides significant task performance advantages to entrepreneurs in addressing the problem of uncertainty, in part, through enhanced predictive capabilities. What is less clear, however, is whether AI tools enable entrepreneurs to manage problems of "Knightian uncertainty", a fundamental type of uncertainty that manifests in a cascading set of four interrelated problems: actor ignorance, practical indeterminism, agentic novelty, and competitive recursion. In this study, we argue that the predictive capabilities of AI tools are contingent upon the ability of these systems to grapple with the problems of Knightian uncertainty. We investigate the logic of this approach in an in-depth analysis of the limits of foundational and emerging types of AI to address these problems, identifying areas of computational irreducibility where the manifestation of Knightian uncertainty limits the use of AI in entrepreneurship.
Critical Perspectives on Accounting, Journal Year: 2024, Volume and Issue: 99, P. 102722 - 102722
Published: Feb. 22, 2024
New large language models (LLMs) like ChatGPT have the potential to change qualitative research by contributing to every stage of the research process, from generating interview questions to structuring research publications. However, it is far from clear whether such 'assistance' will enable or deskill and eventually displace the qualitative researcher. This paper sets out to explore the implications for qualitative research of the recently emerged capabilities of LLMs; how they have acquired their seemingly 'human-like' capabilities to 'converse' with us humans, and in what ways these capabilities are deceptive or misleading. Building on a comparison of the different 'trainings' of humans and LLMs, the paper first traces the human-like qualities of the LLM to the human proclivity to project communicative intent into or onto LLMs' purely imitative capacity to predict the structure of human communication. It then goes into detail on the ways in which the communication of LLMs is misleading in relation to the absolute 'certainty' with which LLMs 'converse', their intrinsic tendencies to 'hallucination' and 'sycophancy', the narrow conception of 'artificial intelligence', the complete lack of ethical sensibility or responsibility, and finally the feared danger of an 'emergence' of 'human-competitive' or 'superhuman' LLM capabilities. The paper concludes by noting the dangers of the widespread use of LLMs as 'mediators' of human self-understanding and culture. A postscript offers a brief reflection on what only humans can do as qualitative researchers.
Computational Linguistics, Journal Year: 2023, Volume and Issue: 50(1), P. 293 - 350
Published: Nov. 15, 2023
Transformer language models have received widespread public attention, yet their generated text is often surprising even to NLP researchers. In this survey, we discuss over 250 recent studies of English language model behavior before task-specific fine-tuning. Language models possess basic capabilities in syntax, semantics, pragmatics, world knowledge, and reasoning, but these capabilities are sensitive to specific inputs and surface features. Despite dramatic increases in generated text quality as models scale to hundreds of billions of parameters, the models are still prone to unfactual responses, commonsense errors, memorized text, and social biases. Many of these weaknesses can be framed as over-generalizations or under-generalizations of learned patterns in text. We synthesize recent results to highlight what is currently known about large language model capabilities, thus providing a resource for applied work and for research in adjacent fields that use language models.
Mind & Language, Journal Year: 2023, Volume and Issue: 39(2), P. 237 - 259
Published: July 12, 2023
Can large language models produce expert‐quality philosophical texts? To investigate this, we fine‐tuned GPT‐3 with the works of philosopher Daniel Dennett. To evaluate the model, we asked the real Dennett 10 philosophical questions and then posed the same questions to the language model, collecting four responses for each question without cherry‐picking. Experts on Dennett's work succeeded at distinguishing the Dennett‐generated and machine‐generated answers above chance but substantially short of our expectations. Philosophy blog readers performed similarly to the experts, while ordinary research participants were near chance distinguishing GPT‐3's responses from those of an “actual human philosopher”.
Large language models (LLMs), like ChatGPT, GitHub Copilot, and Microsoft Copilot, present challenges in university education, particularly for paper assignments. These AI-driven tools enable students to (semi)automatically complete tasks that were previously considered evidence of skill acquisition, potentially affecting grading and skill development. However, the use of these tools is not legally plagiarism and is becoming increasingly integrated into various software solutions. University education in the social sciences aims to develop students' abilities to make sense of the world, connect their observations with abstract structures, measure phenomena of interest, systematically test expectations, and present their findings in structured accounts. These practices are learned through repeated performance of tasks, such as writing research papers. LLM applications like ChatGPT create conflicting incentives for students, who might rely on them to produce parts of their papers instead of engaging in the learning process. While LLMs can be helpful tools for knowledge discovery, writing assistance, and coding assistance, using them effectively and safely requires an understanding of their underlying mechanisms, potential weaknesses, and enough domain knowledge to identify mistakes. This makes LLMs particularly challenging for students in the early stages of acquiring scientific skills and knowledge. Educators must train students to responsibly use these new tools, reflecting on the tensions and their strengths and weaknesses for academic tasks. This working paper aims to provide guidelines for the responsible use of LLMs in academic contexts, specifically at the Chair for the Governance of Complex and Innovative Technological Systems at the University of Bamberg. The paper discusses the function of written assignments and the skills necessary to complete them, and evaluates ChatGPT's abilities in assisting with these tasks. It concludes with advice to maximize the benefits while mitigating the risks of using LLMs, focusing on enabling learning.
In recent years, the process humans adopt to learn a foreign language has moved from the strict "Grammar–Translation" method, which is based mainly on grammar and syntax rules, to more innovative processes, resulting in the modern "Communicative approach". As its name states, this approach focuses on coherent communication with native speakers and the cultivation of oral skills, without taking into consideration, at least in the first stages, the grammar and syntax rules that govern the language.
Proceedings of the National Academy of Sciences, Journal Year: 2024, Volume and Issue: 121(41)
Published: Oct. 4, 2024
The widespread adoption of large language models (LLMs) makes it important to recognize their strengths and limitations. We argue that to develop a holistic understanding of these systems, we must consider the problem that they were trained to solve: next-word prediction over Internet text. By recognizing the pressures that this task exerts, we can make predictions about the strategies that LLMs will adopt, allowing us to reason about when they will succeed or fail. Using this approach, which we call the teleological approach, we identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input. To test our predictions, we evaluate five LLMs (GPT-3.5, GPT-4, Claude 3, Llama 3, and Gemini 1.0) on 11 tasks, and we find robust evidence that LLMs are influenced by probability in the hypothesized ways. Many of the experiments reveal surprising failure modes. For instance, GPT-4’s accuracy at decoding a simple cipher is 51% when the output is a high-probability sentence but only 13% when it is low-probability, even though this task is a deterministic one for which probability should not matter. These results show that AI practitioners should be careful about using LLMs in low-probability situations. More broadly, we conclude that we should not evaluate LLMs as if they are humans but should instead treat them as a distinct type of system, one that has been shaped by its own particular set of pressures.
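Instructor’s feedback plays a critical role in students’ development of conceptual understanding and reasoning skills. However, grading student written responses and providing personalized feedback can take a substantial amount of time, especially in large enrollment courses. In this study, we explore using GPT-3.5 to write feedback on student written responses to conceptual questions with prompt engineering and few-shot learning techniques. In stage I, we used a small portion (n=20) of the student responses on one conceptual question to iteratively train GPT to generate feedback. Four of the responses paired with human-written feedback were included as examples for GPT. We tasked GPT to generate feedback for another 16 responses and refined the prompt through several iterations. In stage II, we gave four student researchers (one graduate and three undergraduate researchers) the 16 responses as well as two versions of feedback, one written by the authors and the other by GPT. Students were asked to rate the correctness and usefulness of each feedback and to indicate which one was generated by GPT. The results showed that students tended to rate the human and GPT feedback equally on correctness, but they all rated the feedback by GPT as more useful. Additionally, the success rates of identifying GPT’s feedback were low, ranging from 0.1 to 0.6. In stage III, we tasked GPT to generate feedback for the rest of the student responses (n=65). The feedback messages were rated by four instructors based on the extent of modification needed if they were to give the feedback to students. All four instructors rated approximately 70% (ranging from 68% to 78%) of the feedback statements as needing only minor or no modification. This study demonstrated the feasibility of using generative artificial intelligence (AI) as an assistant to generate feedback for student written responses with only a relatively small number of examples in the prompt. An AI assistant can be one of the solutions to substantially reduce time spent on grading student written responses.
Published by the American Physical Society 2024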