Applied Linguistics, Journal year: 2023, Issue: 45(2), pp. 308-329
Published: May 22, 2023
Abstract
The present study assesses linguistic and geographic diversity in selected outlets of SLA and multilingualism research. Specifically, we examine over 2,000 articles published in specialized top-tier journals, recording the languages under study, their acquisition order, author affiliations, the country in which the research was conducted, and citations. In the sample, there were 183 unique languages and 174 unique language pairings, corresponding to 3 per cent of the world’s 7,000 languages and less than 0.001 per cent of the 24.5 million possible language combinations. English was overwhelmingly the most common language, followed by Spanish and Mandarin Chinese. North America and Western Europe were both the main producers of knowledge and the main sites for research on multilingualism in the sample. Crucially, the regions with the highest levels of societal multilingualism (typically the Global South) were only marginally represented. The findings also show that studies on northern Anglophone settings were likely to elicit more citations than studies on other settings, and they report on how frequently the languages studied were included in article titles.
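The proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes that "possible language combinations" means unordered pairs drawn from roughly 7,000 languages. That reading is an assumption, chosen because it reproduces the 24.5 million figure; the variable names are illustrative only.

```python
from math import comb

# Figures quoted in the abstract.
WORLD_LANGUAGES = 7_000       # approximate number of the world's languages
observed_languages = 183      # unique languages in the sample
observed_pairings = 174       # unique language pairings in the sample

# Assumption: a "combination" is an unordered pair of distinct languages.
possible_pairings = comb(WORLD_LANGUAGES, 2)

language_share = 100 * observed_languages / WORLD_LANGUAGES
pairing_share = 100 * observed_pairings / possible_pairings

print(f"possible pairings: {possible_pairings:,}")
print(f"share of languages covered: {language_share:.1f} per cent")
print(f"share of pairings covered: {pairing_share:.5f} per cent")
```

Under this assumption the pair count comes to about 24.5 million, the language share rounds to 3 per cent, and the pairing share falls below 0.001 per cent, in line with the abstract.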
Proceedings of the National Academy of Sciences, Journal year: 2024, Issue: 121(34)
Published: Aug. 12, 2024
The social and behavioral sciences have been increasingly using automated text analysis to measure psychological constructs in text. We explore whether GPT, the large-language model (LLM) underlying the AI chatbot ChatGPT, can be used as a tool for automated psychological text analysis in several languages. Across 15 datasets (n = 47,925 manually annotated tweets and news headlines), we tested whether different versions of GPT (3.5 Turbo, 4, and 4 Turbo) can accurately detect psychological constructs (sentiment, discrete emotions, offensiveness, and moral foundations) across 12 languages. We found that GPT (r = 0.59 to 0.77) performed much better than English-language dictionary analysis (r = 0.20 to 0.30) at detecting psychological constructs as judged by manual annotators. GPT performed nearly as well as, and sometimes better than, several top-performing fine-tuned machine learning models. Moreover, GPT’s performance improved across successive versions of the model, particularly for lesser-spoken languages, and became less expensive. Overall, GPT may be superior to many existing methods of automated text analysis, since it achieves relatively high accuracy across many languages, requires no training data, and is easy to use with simple prompts (e.g., “is this text negative?”) and little coding experience. We provide sample code and a video tutorial for analyzing text with the GPT application programming interface. We argue that GPT and other LLMs help democratize automated text analysis by making advanced natural language processing capabilities more accessible, and may help facilitate more cross-linguistic research with understudied languages.
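The workflow this abstract describes (a simple prompt per text, then agreement with manual annotations reported as a correlation r) can be sketched roughly as follows. This is not the authors' sample code. The prompt wording, the `annotate` and `pearson_r` helpers, and the `keyword_stub` stand-in for a real model call are all hypothetical, introduced only so the example runs without network access.

```python
import math
import re
from typing import Callable, List, Sequence

# Hypothetical prompt wording, loosely modelled on the abstract's example prompt.
PROMPT_TEMPLATE = (
    "Is the sentiment of this text positive, neutral, or negative? "
    "Answer only with a number: 1 if positive, 2 if neutral, 3 if negative. "
    "Here is the text: {text}"
)

def annotate(texts: Sequence[str], ask_model: Callable[[str], str]) -> List[int]:
    """Send one prompt per text and parse the first digit 1-3 from each reply."""
    ratings = []
    for text in texts:
        reply = ask_model(PROMPT_TEMPLATE.format(text=text))
        match = re.search(r"[123]", reply)
        if match is None:
            raise ValueError(f"Unparseable reply: {reply!r}")
        ratings.append(int(match.group()))
    return ratings

def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length rating sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def keyword_stub(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): a trivial keyword rule."""
    text = prompt.split("Here is the text: ", 1)[1].lower()
    if "terrible" in text:
        return "3"
    if "great" in text:
        return "1"
    return "2"

# Toy data, invented for illustration; not taken from the study's datasets.
texts = ["What a great day", "Terrible news tonight", "The meeting is at noon"]
human_ratings = [1, 3, 2]

model_ratings = annotate(texts, keyword_stub)
agreement = pearson_r(model_ratings, human_ratings)
print(model_ratings, agreement)
```

In practice `ask_model` would wrap a call to whichever LLM API is being evaluated. Keeping it as a plain callable means the annotation and scoring logic stays the same across model versions.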
Large language models (LLMs) have recently made vast advances in both generating and analyzing textual data. Technical reports often compare LLMs’ outputs with “human” performance on various tests. Here, we ask, “Which humans?” Much of the existing literature largely ignores the fact that humans are a cultural species with substantial psychological diversity around the globe that is not fully captured by the textual data on which current LLMs have been trained. We show that LLMs’ responses to psychological measures are an outlier compared with large-scale cross-cultural data, and that their performance on cognitive psychological tasks most resembles that of people from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies but declines rapidly as we move away from these populations (r = -.70). Ignoring cross-cultural diversity in both human and machine psychology raises numerous scientific and ethical issues. We close by discussing ways to mitigate the WEIRD bias in future generations of generative language models.
Nature Communications, Journal year: 2024, Issue: 15(1)
Published: June 6, 2024
Abstract
Humans produce two forms of cognitively complex vocalizations: speech and song. It is debated whether these differ based primarily on culturally specific, learned features, or if acoustical features can reliably distinguish them. We study the spectro-temporal modulation patterns of vocalizations produced by 369 people living in 21 urban, rural, and small-scale societies across six continents. Specific ranges of spectral and temporal modulations, overlapping within categories and across societies, significantly differentiate speech from song. Machine-learning classification shows that this effect is cross-culturally robust, with vocalizations being reliably classified solely from their spectro-temporal features across all 21 societies. Listeners unfamiliar with the cultures classify these vocalizations using similar spectro-temporal cues as the machine learning algorithm. Finally, spectro-temporal features are better able to discriminate song from speech than a broad range of other acoustical variables, suggesting that spectro-temporal modulation (a key feature of auditory neuronal tuning) accounts for a fundamental difference between these categories.
Both music and language are found in all known human societies, yet no studies have compared similarities and differences between song, speech, and instrumental music on a global scale. In this Registered Report, we analyzed two global datasets: (i) 300 annotated audio recordings representing matched sets of traditional songs, recited lyrics, conversational speech, and instrumental melodies from our 75 coauthors speaking 55 languages; and (ii) 418 previously published adult-directed song and speech recordings from 209 individuals speaking 16 languages. Of our six preregistered predictions, five were strongly supported: Relative to speech, songs use (i) higher pitch, (ii) slower temporal rate, and (iii) more stable pitches, while both songs and speech used similar (iv) pitch interval size and (v) timbral brightness. Exploratory analyses suggest that features vary along a "musi-linguistic" continuum when including instrumental melodies and recited lyrics. Our study provides strong empirical evidence of cross-cultural regularities in music and speech.
The emergence of large language models (LLMs) has sparked considerable interest in their potential application in psychological research, mainly as a model of the human psyche or as a general text-analysis tool. However, the trend of using LLMs without sufficient attention to their limitations and risks, which we rhetorically refer to as "GPTology", can be detrimental given the easy access to models such as ChatGPT. Beyond existing general guidelines, we investigate the current limitations, ethical implications, and potential of LLMs specifically for psychological research, and show their concrete impact in various empirical studies. Our results highlight the importance of recognizing global psychological diversity, cautioning against treating LLMs (especially in zero-shot settings) as universal solutions for text analysis, and developing transparent, open methods to address LLMs' opaque nature for reliable, reproducible, and robust inference from AI-generated data. Acknowledging LLMs' utility for task automation, such as text annotation, or to expand our understanding of human psychology, we argue for diversifying human samples and expanding psychology's methodological toolbox to promote an inclusive, generalizable science, countering homogenization and over-reliance on LLMs.
Reading Research Quarterly, Journal year: 2025, Issue: 60(2)
Published: Feb. 3, 2025
Abstract
In this essay, I outline some of the essential ingredients of a universal theory of reading acquisition, one that seeks to highlight commonalities while embracing the global diversity of languages, writing systems, and cultures. I begin by stressing the need to consider insights from multiple disciplines including neurobiology, cognitive science, linguistics, socio-cultural, and historical inquiry, although my major emphasis is on a writing systems approach. A theme common to several of these perspectives is the need to attain a level of word reading speed and effortlessness necessary to overcome the severe limitations of human (sequential) information processing, thereby allowing the reader to devote maximum resources to comprehension. I then present the Combinatorial Model, a universal theory of learning to read based on the fundamental principle of spoken and written language: combinatoriality. This principle (“infinite ends from finite means”) makes it possible for children to learn how to decipher (i.e., decode), combine, and chunk/unitize a limited and learnable set of rudimentary (typically meaningless) elements such as letters, aksharas, syllabograms, and character components into a nested hierarchy of meaningful higher-order units such as morphemes and words that can be recognized instantly and effortlessly via rapid parallel processing of their constituent elements. Combinatoriality enables an orthography to provide learnability and decipherability for the novice (via phonological transparency) as well as unitizability and automatizability for the expert (via morphemic transparency). I then elaborate on (i) the dual nature of the model and the unfamiliar-to-familiar/novice-to-expert framework, (ii) the unit/s of unitization, and (iii) writing. I liken reading development to a tree that grows both upwards and outwards. Vertical growth is thought of as a 3-phase progression from the sub-morphemic, through the morpho-lexical, to the supra-lexical phases, in which later-developing phases do not replace earlier phases but are added to the combinatorial hierarchy. Outward growth is conceptualized as a process of knowledge arborization: ongoing refinement, elaboration, and diversification. I conclude by noting that, despite important recent advances, our knowledge of learning to read non-European and non-alphabetic writing systems is still in its infancy. Current reading research is over-reliant on English (an outlier orthography) together with a handful of Roman-script Western European languages. This has led reading science to neglect many issues of significance such as homography, tone, diacritics, visual complexity, non-linearity, linguistic distance, multilingualism, multiscriptism, and more. An appreciation of the specifics of the particular language (or languages) and orthography (or orthographies) a child is learning, within the broader context of global linguistic, orthographic, and cultural diversity, is crucial not only for a deeper understanding of learning to read a specific language but also for a truly non-ethnocentric science of reading.