This study investigates the capacity of Large Language Models (LLMs) to infer Big Five personality traits from free-form user interactions. The results demonstrate that a chatbot powered by GPT-4 can infer these traits with moderate accuracy, outperforming previous approaches that draw inferences from static text content. However, accuracy varied across different conversational settings. Performance was highest when the chatbot was prompted to elicit personality-relevant information from users (mean r = .443, range = [.245, .640]), followed by a condition placing greater emphasis on naturalistic interaction (mean r = .218, range = [.066, .373]). Notably, the direct focus on personality assessment did not result in a less positive user experience, with participants reporting the interactions to be equally natural, pleasant, engaging, and humanlike across both conditions. A chatbot mimicking ChatGPT's default behavior of acting as a helpful assistant led to markedly inferior inferences and lower experience ratings, but still captured psychologically meaningful information for some traits (mean r = .117, range = [-.004, .209]). Preliminary analyses suggest that inference accuracy varies only marginally across socio-demographic subgroups. Our findings highlight the potential of LLMs for psychological profiling based on conversational interactions. We discuss the practical implications and ethical challenges associated with these findings.
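The per-trait accuracy metric used above (a Pearson r between self-reported and LLM-inferred trait scores, averaged across traits) can be sketched as follows. The scores and the single-trait dictionary below are hypothetical placeholders, not the study's data.

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores for one trait: questionnaire self-reports vs. the
# chatbot's inferred ratings for the same participants.
self_report = [3.2, 4.1, 2.8, 3.9, 4.5, 2.5]
inferred    = [3.0, 4.3, 3.1, 3.6, 4.4, 2.9]

traits = {"openness": (self_report, inferred)}  # real study: all Big Five
rs = {t: pearson_r(a, b) for t, (a, b) in traits.items()}
mean_r = statistics.fmean(rs.values())  # the "mean r" reported per condition
```

With all five traits populated, `mean_r` corresponds to the per-condition figures quoted in the abstract.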
Frontiers of Computer Science, Journal Year: 2024, Volume and Issue: 18(6), Published: March 22, 2024
Abstract
Autonomous agents have long been a research focus in academic and industry communities. Previous research often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and makes it hard for the agents to achieve human-like decisions. Recently, through the acquisition of vast amounts of Web knowledge, large language models (LLMs) have shown potential in achieving human-level intelligence, leading to a surge in research on LLM-based autonomous agents. In this paper, we present a comprehensive survey of these studies, delivering a systematic review from a holistic perspective. We first discuss the construction of LLM-based autonomous agents, proposing a unified framework that encompasses much of the previous work. Then, we provide an overview of the diverse applications of LLM-based autonomous agents in social science, natural science, and engineering. Finally, we delve into the evaluation strategies commonly used for LLM-based autonomous agents. Based on the previous studies, we also present several challenges and future directions in this field.
JMIR Mental Health, Journal Year: 2024, Volume and Issue: 11, P. e55988 - e55988, Published: March 8, 2024
Large language models (LLMs) hold potential for mental health applications. However, their opaque alignment processes may embed biases that shape problematic perspectives. Evaluating the values embedded within LLMs that guide their decision-making therefore has ethical importance. Schwartz's theory of basic values (STBV) provides a framework for quantifying cultural value orientations and has shown utility for examining values in mental health contexts, including cultural, diagnostic, and therapist-client dynamics.
Personality and Individual Differences, Journal Year: 2024, Volume and Issue: 228, P. 112729 - 112729, Published: June 3, 2024
Personality research has traditionally relied on questionnaires, which bring with them inherent limitations, such as response style bias. With the emergence of large language models such as ChatGPT, the question arises to what extent these models can be used in personality research. In this study, ChatGPT (GPT-4) generated 2000 text-based personas. Next, for each persona, it completed a short form of the Big Five Inventory (BFI-10), the Brief Sensation Seeking Scale (BSSS), and the Short Dark Triad (SD3). The mean scores on the BFI-10 items were found to correlate strongly with means from previously published research, and a principal component analysis revealed a clear five-component structure. Certain relationships between traits, such as the negative correlation between persona age and BSSS score, were clearly interpretable, while some other correlations diverged from the literature. An additional analysis using four new sets of personas, including a set of 'realistic' cinematic personas, showed that the correlation matrix among the constructs was affected by the persona set. It is concluded that using generated personas to evaluate questionnaires and hypotheses prior to engaging real individuals holds promise.
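The five-component check described above (a principal component analysis over BFI-10 item scores) can be sketched on simulated data. The generated item matrix below is a hypothetical stand-in for the persona questionnaire responses, not the study's data; the Kaiser eigenvalue-greater-than-one rule is one common way to count components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for persona data: n personas x 10 BFI-10 items,
# two items per Big Five trait, plus measurement noise.
n = 2000
latent = rng.normal(size=(n, 5))            # five underlying traits
loadings = np.repeat(np.eye(5), 2, axis=0)  # each trait loads on two items
items = latent @ loadings.T + 0.5 * rng.normal(size=(n, 10))

# Principal component analysis on the item correlation matrix.
corr = np.corrcoef(items, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]    # eigenvalues, descending

# Kaiser criterion: retain components with eigenvalue > 1.
n_components = int((eigvals > 1).sum())
print(n_components)  # a clean five-component structure for this simulation
```

By construction each pair of items shares one latent trait, so the eigenvalue spectrum splits into five large and five small values, mirroring the clear five-component structure the study reports.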
Vicinagearth, Journal Year: 2024, Volume and Issue: 1(1), Published: Oct. 8, 2024
Abstract
The pursuit of more intelligent and credible autonomous systems, akin to human society, has been a long-standing endeavor for humans. Leveraging the exceptional reasoning and planning capabilities of large language models (LLMs), LLM-based agents have been proposed and have achieved remarkable success across a wide array of tasks. Notably, LLM-based multi-agent systems (MAS) are considered a promising pathway towards realizing general artificial intelligence that is equivalent to or surpasses human-level intelligence. In this paper, we present a comprehensive survey of these studies, offering a systematic review of LLM-based MAS. Adhering to their workflow, we synthesize a general structure encompassing five key components: profile, perception, self-action, mutual interaction, and evolution. This unified framework encapsulates much of the previous work in the field. Furthermore, we illuminate the extensive applications of LLM-based MAS in two principal areas: problem-solving and world simulation. Finally, we discuss in detail several contemporary challenges and provide insights into potential future directions in this domain.
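A minimal sketch of the five key components named above (profile, perception, self-action, mutual interaction, evolution). The class and method names are our own illustrative assumptions, not an API from the survey; the LLM calls are replaced by placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Illustrative agent organized around the five survey components."""
    profile: dict                          # role, persona, goals
    memory: list = field(default_factory=list)

    def perceive(self, observation: str) -> None:
        # Perception: ingest an observation into memory.
        self.memory.append(("obs", observation))

    def self_act(self) -> str:
        # Self-action: placeholder for an LLM call that plans from
        # the profile plus accumulated memory.
        return f"{self.profile['role']} acts on {len(self.memory)} memories"

    def interact(self, other: "Agent", message: str) -> None:
        # Mutual interaction: one agent's message becomes another's observation.
        other.perceive(f"from {self.profile['role']}: {message}")

    def evolve(self) -> None:
        # Evolution: placeholder for updating the profile or strategy
        # from experience.
        self.profile["experience"] = self.profile.get("experience", 0) + 1

a = Agent(profile={"role": "planner"})
b = Agent(profile={"role": "critic"})
a.perceive("task: summarize document")
a.interact(b, "please review my draft")
a.evolve()
```

The point of the sketch is the division of responsibilities, not the stub logic: each of the five components maps to one field or method.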
We identify some of the most pressing issues facing the adoption of large language models (LLMs) in practical settings and propose a research agenda to reach the next technological inflection point in generative AI. We focus on three challenges in LLM applications: domain-specific expertise and the ability to tailor that expertise to a user's unique situation; trustworthiness and adherence to moral and ethical standards; and conformity to regulatory guidelines and oversight.
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Journal Year: 2024, Volume and Issue: unknown, P. 6437 - 6447, Published: Aug. 24, 2024
With the rapid advancements of large language models (LLMs), information retrieval (IR) systems, such as search engines and recommender systems, have undergone a significant paradigm shift. This evolution, while heralding new opportunities, introduces emerging challenges, particularly in terms of biases and unfairness, which may threaten the information ecosystem. In this paper, we present a comprehensive survey of existing works on the pressing bias and unfairness issues in IR systems that arise with the integration of LLMs. We first unify these issues as distribution mismatch problems, providing a groundwork for categorizing various mitigation strategies through distribution alignment. Subsequently, we systematically delve into the specific issues arising from three critical stages of integrating LLMs into IR systems: data collection, model development, and result evaluation. In doing so, we meticulously review and analyze recent literature, focusing on the definitions, characteristics, and corresponding mitigation strategies associated with these issues. Finally, we identify and highlight some open problems and challenges for future work, aiming to inspire researchers and stakeholders in the IR field and beyond to better understand and mitigate bias and unfairness in the LLM era. We also consistently maintain a GitHub repository of relevant papers and resources in this rising direction at https://github.com/KID-22/LLM-IR-Bias-Fairness-Survey.
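The distribution-mismatch framing above can be illustrated with a simple divergence measure. The exposure distributions and the choice of KL divergence below are our own hypothetical example of quantifying such a mismatch, not a method prescribed by the paper.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions over the same items."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Hypothetical exposure distributions over three content groups.
target_exposure = [0.34, 0.33, 0.33]   # desired (e.g., fair) distribution
system_exposure = [0.60, 0.25, 0.15]   # what an LLM-based system produces

# Zero divergence means the system matches the target; larger values
# indicate a bigger bias/unfairness mismatch to mitigate via alignment.
mismatch = kl_divergence(system_exposure, target_exposure)
```

Under this framing, mitigation strategies can be read as different ways of pushing the system's distribution toward the target one.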
Scientific Reports, Journal Year: 2024, Volume and Issue: 14(1), Published: Nov. 10, 2024
Abstract
Large language models (LLMs) have been a catalyst for the public interest in artificial intelligence (AI). These technologies perform some knowledge-based tasks better and faster than human beings. However, whether AIs can correctly assess social situations and devise socially appropriate behavior is still unclear. We conducted an established Situational Judgment Test (SJT) with five different chatbots and compared their results with the responses of human participants (N = 276). Claude, Copilot, and you.com's smart assistant performed significantly better than humans in proposing suitable behaviors in social situations. Moreover, their effectiveness ratings of behavior options aligned well with expert ratings. These results indicate that LLMs are capable of producing adept social judgments. While this constitutes an important requirement for their use as virtual assistants, challenges and risks remain associated with their wide-spread use in social contexts.
Computers in Human Behavior: Artificial Humans, Journal Year: 2024, Volume and Issue: 2(2), P. 100072 - 100072, Published: June 7, 2024
When searching and browsing the web, more and more of the information we encounter is generated or mediated by large language models (LLMs). This can be when looking for a recipe, getting help on an essay, or seeking relationship advice. Yet, there is limited understanding of how individuals perceive the advice provided by these LLMs. In this paper, we explore people's perception of LLM-generated advice, and what role diverse user characteristics (i.e., personality and technology readiness) play in shaping their perception. Further, as LLM-generated advice can be difficult to distinguish from human advice, we assess the perceived creepiness of such advice. To investigate this, we run an exploratory study (N = 91), where participants rate advice in different styles (generated by GPT-3.5 Turbo). Notably, our findings suggest that individuals who identify as more agreeable tend to like the advice more and find it more useful. Furthermore, individuals with higher technological insecurity are more likely to follow the advice and find it useful, and to deem it advice a friend could have given. Lastly, we see that the 'skeptical' advice style was rated as the most unpredictable, while the 'whimsical' style was rated the least malicious, indicating that the style of LLM-generated advice can influence user perceptions. Our results also provide an overview of considerations, such as follow likelihood and receptiveness, that matter when people seek advice from digital assistants. Based on these results, we provide design takeaways and outline future research directions to further inform and support the design of LLM-based applications targeting people's expectations and needs.