Designing Incremental Knowledge Enrichment in Generative Pre-trained Transformers
Research Square (Research Square)
Journal Year: 2024
Volume and Issue: unknown
Published: April 1, 2024
Abstract
This article presents a novel approach to Incremental Knowledge Enrichment tailored for GPT-Neo, addressing the challenge of keeping Large Language Models (LLMs) updated with the latest information without undergoing comprehensive retraining. We introduce a dynamic linking mechanism that enables real-time integration of diverse data sources, thereby enhancing the model's accuracy, timeliness, and relevance. Through rigorous evaluation, our method demonstrates significant improvements in model performance across several metrics. The research contributes a scalable and efficient solution to one of the most pressing issues in AI, potentially revolutionizing the maintenance and applicability of LLMs. The findings underscore the feasibility of creating more adaptive, responsive, and sustainable generative models, opening new avenues for future advancements in the field.
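The abstract does not describe how the dynamic linking mechanism is implemented. As a rough illustration of the general idea only (keeping a frozen model current by linking it to external sources at query time instead of retraining), a minimal sketch follows; the `ExternalSource` and `DynamicLinker` names, the prompt format, and the retrieval step are hypothetical and not the authors' method.

```python
# Minimal sketch of query-time knowledge linking for a frozen LLM.
# All class and method names are hypothetical illustrations, not the paper's API.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ExternalSource:
    """A data source that can be linked at inference time (e.g., a news feed or database)."""
    name: str
    search: Callable[[str], List[str]]  # returns passages relevant to a query


class DynamicLinker:
    """Enriches prompts with fresh passages so the base model needs no retraining."""

    def __init__(self, generate: Callable[[str], str], sources: List[ExternalSource], k: int = 3):
        self.generate = generate   # frozen model, e.g., a GPT-Neo text-generation function
        self.sources = sources     # registry of live data sources
        self.k = k                 # passages kept per source

    def answer(self, query: str) -> str:
        # 1. Pull up-to-date passages from every registered source in real time.
        passages = []
        for source in self.sources:
            passages.extend(source.search(query)[: self.k])
        # 2. Prepend the retrieved context to the query and let the frozen model answer.
        context = "\n".join(f"- {p}" for p in passages)
        prompt = f"Use the facts below if relevant.\n{context}\n\nQuestion: {query}\nAnswer:"
        return self.generate(prompt)
```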
Language: English
All Your Base Are Belong to Us: The Urgent Reality of Unproctored Testing in the Age of LLMs
International Journal of Selection and Assessment
Journal Year: 2025
Volume and Issue: 33(2)
Published: March 4, 2025
ABSTRACT
The release of new generative artificial intelligence (AI) tools, including large language models (LLMs), continues at a rapid pace. Upon the release of OpenAI's o1 models, I reconducted Hickman et al.'s (2024) analyses examining how well LLMs perform on a quantitative ability (number series) test. GPT-4 scored below the 20th percentile (compared to thousands of human test takers), but o1 scored at the 95th percentile. In response to these updated findings and Lievens and Dunlop's (2025) article about the effects of LLMs on the validity of pre-employment assessments, I make an urgent call to action for selection and assessment researchers and practitioners. A recent survey suggests that a sizable proportion of applicants are already using AI tools to complete high-stakes assessments, and it seems no current assessments will be safe for long. Thus, I offer possibilities for future testing, detail their benefits and drawbacks, and provide recommendations. These are: increased use of proctoring, adding strict time limits, LLM detection software, think-aloud (or similar) protocols, collecting and analyzing trace data, emphasizing samples over signs, and redesigning assessments to allow AI use during completion. Several of these possibilities should inspire research to modernize assessment. Future research should seek to improve our understanding of how to design assessments for valid unproctored use, how to effectively analyze test-taker trace data, and whether think-aloud protocols can help differentiate experts from novices.
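The percentile framing above is simply where a model's raw test score falls within the distribution of human test takers' scores. A small sketch of that comparison, with simulated scores rather than Hickman et al.'s data:

```python
import numpy as np

def percentile_rank(model_score: float, human_scores: np.ndarray) -> float:
    """Share of human test takers scoring at or below the model, as a percentage."""
    return 100.0 * np.mean(human_scores <= model_score)

# Illustrative only: simulated number-series scores for a few thousand human test takers.
rng = np.random.default_rng(0)
human_scores = np.clip(rng.normal(loc=12, scale=4, size=5000), 0, 25).round()

print(percentile_rank(9, human_scores))   # a below-average score falls at a low percentile
print(percentile_rank(20, human_scores))  # a high score falls near the top of the distribution
```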
Language: English
AI can outperform humans in predicting correlations between personality items
Philipp Schoenegger, Spencer Greenberg, Alexander Grishin et al.
Communications Psychology
Journal Year: 2025
Volume and Issue: 3(1)
Published: Feb. 12, 2025
Abstract
We assess the abilities of both specialized deep neural networks, such as PersonalityMap, and general LLMs, including GPT-4o and Claude 3 Opus, in understanding human personality by predicting correlations between questionnaire items. All AI models outperform the vast majority of laypeople and academic experts. However, we can improve the accuracy of individual correlation predictions by taking the median prediction per group to produce a “wisdom of crowds” estimate. Thus, we also compare median predictions from laypeople, academic experts, GPT-4o/Claude 3 Opus, and PersonalityMap. Based on medians, PersonalityMap and academic experts surpass the LLMs on most measures. These results suggest that while advanced LLMs make superior predictions compared to individual humans, specialized models like PersonalityMap can match even expert group-level performance on domain-specific tasks. This underscores the capabilities of large language models while emphasizing the continued relevance of specialized systems, as well as human experts, for research.
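The “wisdom of crowds” step described above is just the item-wise median of a group's predictions, scored against the observed correlations. A minimal sketch with simulated predictions; the scoring metric used here (Pearson correlation between predicted and observed correlations) is an assumption, not necessarily the paper's exact measure:

```python
import numpy as np

rng = np.random.default_rng(1)
n_items, n_raters = 100, 30

observed = rng.uniform(-0.2, 0.8, size=n_items)              # observed item-pair correlations
noise = rng.normal(0, 0.35, size=(n_raters, n_items))
predictions = np.clip(observed + noise, -1, 1)                # each rater's noisy predictions

def accuracy(pred, truth):
    """Pearson correlation between predicted and observed correlations."""
    return np.corrcoef(pred, truth)[0, 1]

individual = np.mean([accuracy(p, observed) for p in predictions])
crowd = accuracy(np.median(predictions, axis=0), observed)    # wisdom-of-crowds estimate

print(f"mean individual accuracy: {individual:.2f}")
print(f"median-aggregated accuracy: {crowd:.2f}")             # typically higher
```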
Language: English
Large Language Models Can Infer Personality from Free-Form User Interactions
Published: May 20, 2024
This study investigates the capacity of Large Language Models (LLMs) to infer Big Five personality traits from free-form user interactions. The results demonstrate that a chatbot powered by GPT-4 can infer personality with moderate accuracy, outperforming previous approaches drawing inferences from static text content. Accuracy varied across different conversational settings. Performance was highest when the chatbot was prompted to elicit personality-relevant information from users (mean r=.443, range=[.245, .640]), followed by a condition placing greater emphasis on naturalistic interaction (mean r=.218, range=[.066, .373]). Notably, the direct focus on personality assessment did not result in a less positive user experience, with participants reporting the interactions to be equally natural, pleasant, engaging, and humanlike in both conditions. A chatbot mimicking ChatGPT’s default behavior of acting as a helpful assistant led to markedly inferior accuracy and lower experience ratings, but still captured psychologically meaningful information for some traits (mean r=.117, range=[-.004, .209]). Preliminary analyses suggest that accuracy varies only marginally across socio-demographic subgroups. Our results highlight the potential of LLMs for psychological profiling based on free-form user interactions. We discuss the practical implications and ethical challenges associated with these findings.
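The accuracy figures reported above (e.g., mean r=.443) are correlations between inferred and questionnaire-based trait scores, computed per trait and then averaged. A sketch of that scoring step with simulated scores; the data, column names, and noise levels are illustrative, not the study's:

```python
import numpy as np
import pandas as pd

TRAITS = ["openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism"]

# Illustrative data: self-report questionnaire scores and chatbot-inferred scores per participant.
rng = np.random.default_rng(2)
n = 200
self_report = pd.DataFrame(rng.normal(3, 0.7, size=(n, 5)), columns=TRAITS)
inferred = self_report + rng.normal(0, 0.9, size=(n, 5))   # noisy LLM inferences

# Pearson r between inferred and self-reported scores, per trait, then averaged.
per_trait_r = {t: np.corrcoef(self_report[t], inferred[t])[0, 1] for t in TRAITS}
print(per_trait_r)
print("mean r:", np.mean(list(per_trait_r.values())))
```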
Language: English
Multimodal personality assessment from audio, visual, and verbal features
Antonios Koutsoumpis
Published: Aug. 23, 2024
The main theme of the present dissertation was the measurement of personality traits through someone’s verbal and non-verbal behavior. In most studies, traits were measured using the HEXACO model of personality, a theoretical framework, based on cross-cultural lexical research, that organizes personality into six factors: Honesty-Humility, Emotionality, Extraversion, Agreeableness, Conscientiousness, and Openness to Experience. In one of the studies the Big Five Model was used, which contains similar factors except for Honesty-Humility. Behavior was measured across three modalities: (a) audio, including voice characteristics such as intensity or pitch, (b) visual, including facial expressions and head movements, and (c) verbal, including written and spoken text. All modalities were automatically extracted with software developed to measure these types of features at a granular level. Below are presented the findings across the four empirical chapters of the dissertation. These findings have significant implications for practitioners and psychologists alike. Regarding practitioners (e.g., AVI vendors), the results suggest that the content of job interview questions should be carefully designed to activate the traits someone is interested in measuring. The more the question content aligns with the constructs of interest (i.e., the to-be-measured traits), the more the behaviors exhibited in response to those questions will correlate with the traits of interest. Furthermore, even though the algorithm in Chapter 4 was relatively free of biases, some biases did emerge (e.g., existing gender differences were sometimes further exacerbated). As a result, practitioners might consider applying bias mitigation techniques when employing AVIs in selection contexts, even if these techniques reduce the overall performance of the machine learning models. Moreover, personality inferences were mainly driven by verbal content rather than non-verbal behaviors, and the kernel of truth in text-based personality assessment was highlighted, as linguistic features contribute to accurate assessment. Finally, the results showed that the asymmetry in explained variance between self- and observer reports was accounted for by the level of contextualization of the assessment, as suggested by the bandwidth-fidelity dilemma. This suggests that frameworks of judgment accuracy, such as the SOKA model, need to integrate contextualization as an important component to explain this asymmetry.
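As a rough illustration of the multimodal setup, not the dissertation's actual software or pipeline, the sketch below concatenates per-person audio, visual, and verbal feature vectors and fits a linear model to predict one trait; the feature names, dimensions, and synthetic target are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n = 300

# Illustrative per-person features from the three modalities.
audio = rng.normal(size=(n, 10))    # e.g., pitch and intensity statistics
visual = rng.normal(size=(n, 15))   # e.g., facial expressions, head movement
verbal = rng.normal(size=(n, 50))   # e.g., word-category frequencies from transcripts

X = np.hstack([audio, visual, verbal])   # simple early fusion by concatenation
# Synthetic Extraversion scores with some signal from the audio and verbal features.
extraversion = audio[:, 0] * 0.5 + verbal[:, 0] * 0.3 + rng.normal(0, 1, n)

# Cross-validated R^2 of the fused model.
scores = cross_val_score(Ridge(alpha=1.0), X, extraversion, cv=5, scoring="r2")
print("mean cross-validated R^2:", scores.mean().round(2))
```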
Language: English
Developing and Improving Personality Inventories Using Generative Artificial Intelligence: The Psychometric Properties of a Short HEXACO Scale Developed Using ChatGPT 4.0
Journal of Personality Assessment
Journal Year: 2024
Volume and Issue: unknown, pp. 1-7
Published: Dec. 27, 2024
In the current study, we investigated the utility of generative AI for survey development and improvement. To do so, we generated a 24-item HEXACO personality inventory using ChatGPT 4.0 (CHI) and examined whether ChatGPT could modify the CHI to either improve its internal consistency or its content validity. Additionally, we compared the psychometric properties of the different CHI versions with a conceptually similar short HEXACO inventory. Specifically, we compared the three CHI versions with the Brief HEXACO Inventory (BHI) in terms of their alpha reliabilities, their convergent and discriminant correlations with the HEXACO-60, and their criterion-related validity with authoritarianism and social dominance orientation. Participants (N = 682) completed the BHI and were randomly assigned to complete one of the CHI versions. The results showed that the CHI versions were generally comparable to the BHI. However, the modifications did not improve the targeted properties of the CHI. That is, although generative AI shows promise for use in developing questionnaires, it may not offer a shortcut to further improving their psychometric properties.
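The comparison above rests on standard psychometric quantities such as Cronbach's alpha and convergent correlations. As a reminder of how these are computed, a short sketch with simulated item responses (not the study's data; the 4-item scale structure and noise levels are assumptions):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: participants x items matrix for one scale (e.g., a hypothetical 4-item CHI scale)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Illustrative data: 4 items of one CHI scale and the corresponding HEXACO-60 scale score.
rng = np.random.default_rng(4)
latent = rng.normal(size=682)                                      # latent trait, N = 682
chi_items = latent[:, None] + rng.normal(0, 1.0, size=(682, 4))    # 4-item scale responses
hexaco60_scale = latent + rng.normal(0, 0.6, size=682)

print("alpha:", round(cronbach_alpha(chi_items), 2))
# Convergent correlation: CHI scale mean vs. the corresponding HEXACO-60 scale.
print("convergent r:", round(np.corrcoef(chi_items.mean(axis=1), hexaco60_scale)[0, 1], 2))
```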
Language: English