Compositional generalisation (CG), in NLP and in machine learning more generally, has been assessed mostly using artificial datasets. It is important to develop benchmarks to assess CG also on real-world natural language tasks, in order to understand the abilities and limitations of systems deployed in the wild. To this end, our GenBench Collaborative Benchmarking Task submission utilises the distribution-based compositionality assessment (DBCA) framework to split the Europarl translation corpus into a training and a test set in such a way that the test set requires compositional generalisation capacity. Specifically, the training and test sets have divergent distributions of dependency relations, testing NMT systems' capability of translating dependencies they have not been trained on. This is a fully-automated procedure to create benchmarks, making it simple and inexpensive to apply further to other datasets and languages. The code and data for the experiments are available at https://github.com/aalto-speech/dbca.
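As an illustration of how such a split can be scored, the sketch below is a minimal example, not the released implementation in the repository above: it computes a Chernoff-coefficient-based divergence between the dependency-relation "compound" distributions of a candidate training and test set, in the spirit of the DBCA framework. The toy compound tuples and the helper names are assumptions made purely for illustration.

```python
# Minimal sketch of a DBCA-style divergence computation (not the authors' code).
# Compounds are assumed to be dependency-relation tuples extracted elsewhere.
from collections import Counter

def normalise(counts: Counter) -> dict:
    """Turn raw compound counts into a probability distribution."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def chernoff_divergence(p: dict, q: dict, alpha: float = 0.1) -> float:
    """1 - sum_k p_k^alpha * q_k^(1-alpha); 0 = identical distributions, 1 = disjoint support."""
    keys = set(p) | set(q)
    coeff = sum((p.get(k, 0.0) ** alpha) * (q.get(k, 0.0) ** (1 - alpha)) for k in keys)
    return 1.0 - coeff

# Toy example: dependency-relation "compounds" of a candidate train/test split.
train_compounds = Counter({("nsubj", "dobj"): 50, ("amod", "nsubj"): 30})
test_compounds = Counter({("amod", "nsubj"): 10, ("nmod", "case"): 40})

div = chernoff_divergence(normalise(train_compounds), normalise(test_compounds))
print(f"compound divergence: {div:.3f}")  # higher = more compositional generalisation required
```

In a DBCA-style split, the corpus is then divided so that this compound divergence is high while the corresponding atom divergence (computed over individual lemmas and relation types) stays low; the exact compound definitions used for Europarl are those in the linked repository.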
AI & Society, Journal Year: 2024, Volume and Issue: unknown, Published: Nov. 28, 2024
Abstract
Large language models (LLMs) are revolutionizing several areas of Artificial Intelligence. One of the most remarkable applications is creative writing, e.g., poetry or storytelling: the generated outputs are often of astonishing quality. However, a natural question arises: can LLMs really be considered creative? In this article, we first analyze the development of LLMs under the lens of creativity theories, investigating the key open questions and challenges. In particular, we focus our discussion on the dimensions of value, novelty, and surprise as proposed by Margaret Boden in her work. Then, we consider different classic perspectives, namely product, process, press, and person. We discuss a set of "easy" and "hard" problems in machine creativity, presenting them in relation to LLMs. Finally, we examine the societal impact of these technologies, with a particular focus on the creative industries, analyzing the opportunities offered, the challenges arising from them, and the potential associated risks, from both legal and ethical points of view.
SSRN Electronic Journal, Journal Year: 2024, Volume and Issue: unknown, Published: Jan. 1, 2024
Artificial intelligence (AI) now matches or outperforms human intelligence in an astonishing array of games, tests, and other cognitive tasks that involve high-level reasoning and thinking. Many scholars argue that—due to human bias and bounded rationality—humans should (or will soon) be replaced by AI in situations involving high-level cognition and strategic decision making. We disagree. In this paper we first trace the historical origins of the idea of artificial intelligence as a form of computation and information processing. We highlight problems with the analogy between computers and minds as input-output devices, using large language models as an example. Human cognition—in important instances—is better conceptualized as a form of theorizing rather than data processing, prediction, or even Bayesian updating. Our argument, when it comes to cognition, is that AI's data-based prediction is different from theory-based causal logic. We introduce the idea of belief-data (a)symmetries to highlight this difference, and we use the "heavier-than-air flight" example to illustrate our arguments. Theories provide a mechanism for identifying new evidence, a way of "intervening" in the world, experimenting, and problem solving. We conclude with a discussion of the implications of our arguments for decision making, including the role that human-AI hybrids might play in this process.
The proliferation of AI-powered search and recommendation systems has accelerated the formation of "filter bubbles" that reinforce people's biases and narrow their perspectives. Previous research has attempted to address this issue by increasing the diversity of information exposure, which is often hindered by a lack of user motivation to engage with it. In this study, we took a human-centered approach to explore how Large Language Models (LLMs) could assist users in embracing more diverse information. We developed a prototype featuring LLM-powered multi-agent characters that users interact with while reading social media content. We conducted a participatory design study with 18 participants and found that dialogues and gamification incentives can motivate users to engage with opposing viewpoints. Additionally, progressive interactions and assessment tasks can promote thoughtful consideration. Based on these findings, we provide design implications and outlooks for future work on leveraging LLMs to help burst filter bubbles.
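To make the idea of "LLM-powered multi-agent characters" concrete, the sketch below shows one way such a prototype could be wired; it is not the authors' system. The persona definitions, the prompt wording, and the complete() helper (standing in for whatever LLM API the prototype uses) are all assumptions for illustration.

```python
# Hypothetical sketch of multi-agent characters replying to a social media post
# with differing viewpoints. `complete` is a placeholder for an LLM completion
# call, NOT a real library function.
from dataclasses import dataclass

@dataclass
class Character:
    name: str
    stance: str  # the perspective this agent is asked to voice

def complete(prompt: str) -> str:
    """Placeholder for an LLM call (e.g. a request to a hosted model)."""
    raise NotImplementedError("plug in an LLM backend here")

def character_reply(character: Character, post: str) -> str:
    # Each character gets its own persona prompt, so the user sees several
    # distinct viewpoints side by side rather than a single summary.
    prompt = (
        f"You are {character.name}, who tends to argue from this perspective: "
        f"{character.stance}\n"
        "Reply in two or three sentences to the following social media post, "
        f"politely offering your viewpoint.\n\nPost: {post}"
    )
    return complete(prompt)

characters = [
    Character("Alex", "sceptical of the post's claim and asks for evidence"),
    Character("Sam", "sympathetic to the claim but points out trade-offs"),
]

# Usage sketch: one reply per character per post shown in the reading interface.
# for post in feed:
#     for c in characters:
#         show_in_ui(c.name, character_reply(c, post))
```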
Strategy Science, Journal Year: 2024, Volume and Issue: unknown, Published: Dec. 3, 2024
Scholars argue that artificial intelligence (AI) can generate genuine novelty and new knowledge and, in turn, that AI and computational models of cognition will replace human decision making under uncertainty. We disagree. AI's data-based prediction is different from theory-based causal logic and reasoning. We highlight problems with the decades-old analogy between computers and minds as input–output devices, using large language models as an example. Human cognition is better conceptualized as a form of theory-based causal reasoning rather than an emphasis on information processing and data-based prediction. AI uses a probability-based approach to knowledge that is largely backward looking and imitative, whereas theory-based causal reasoning is forward-looking and capable of generating genuine novelty. We introduce the idea of data–belief asymmetries to highlight this difference in cognition, using the example of heavier-than-air flight to illustrate our arguments. Theory-based causal reasoning provides a cognitive mechanism for humans to intervene in the world and to engage in directed experimentation to generate new data. Throughout the article, we discuss the implications of our argument for understanding the origins of novelty, new knowledge, and decision making under uncertainty.
Asia Pacific Journal of Education, Journal Year: 2024, Volume and Issue: 44(1), P. 81 - 93, Published: Jan. 2, 2024
Generative Artificial Intelligence (AI)'s emergence is viewed as a disruptive technological advancement that has been beneficial for most educational purposes but is also coupled with emerging challenges and potentially destabilizing effects. Given the unprecedented onset and surge in interest, education stakeholders are often pressured to adopt such emergent technologies with little space and time to seek a better understanding and attain literacy. This paper brings together existing contributions to identify a list of five common themes (5Ts) on the various uses of generative AI for improving students' learning and for future research. The opportunities arising from the use of generative AI were explored, in part to rethink how education can continue to be relevant in a dynamic environment of emerging technologies, and three "R" guidelines (3Rs) are proposed to aid educators in staying ahead of the curve by addressing and embracing the challenges arising from generative AI in learning.
Nature Communications, Journal Year: 2024, Volume and Issue: 15(1), Published: Oct. 14, 2024
Humans excel at extracting structurally-determined meaning from speech despite its inherent physical variability. This study explores the brain's ability to predict and understand spoken language robustly. It investigates the relationship between structural and statistical language knowledge in brain dynamics, focusing on phase and amplitude modulation. Using syntactic features from constituent hierarchies and surface statistics from a transformer model as predictors of forward encoding models, we reconstructed cross-frequency neural dynamics from MEG data recorded during audiobook listening. Our findings challenge a strict separation of linguistic structure and statistics in the brain, with both aiding signal reconstruction. Syntactic features have a more temporally spread impact, while word entropy and the number of closing constituents are linked to phase-amplitude coupling, implying a role for temporal prediction and cortical oscillation alignment in language processing. Our results indicate that structural and statistical information jointly shape speech comprehension and suggest an integration process via a cross-frequency coupling mechanism.
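For readers unfamiliar with forward encoding models, the sketch below shows the general shape of such an analysis on synthetic data: time-lagged word-level predictors (standing in for syntactic and transformer-derived features) are regressed onto a neural time course with ridge regression, and reconstruction accuracy is the correlation on held-out data. This is a generic illustration under assumed lag ranges, sampling rate, and regularisation, not the authors' MEG pipeline.

```python
# Generic lagged forward encoding model (temporal response function style) on
# placeholder data; feature names, lag range, and alpha are assumptions.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_features = 5000, 3
X = rng.standard_normal((n_samples, n_features))  # e.g. word entropy, surprisal, closing constituents
y = rng.standard_normal(n_samples)                # one MEG sensor/source time course (placeholder)

def lag_features(X, max_lag):
    """Stack time-lagged copies of the predictors so their impact can spread over time.
    (np.roll wraps around at the edges; ignored here for simplicity.)"""
    lagged = [np.roll(X, lag, axis=0) for lag in range(max_lag)]
    return np.concatenate(lagged, axis=1)

X_lagged = lag_features(X, max_lag=40)            # e.g. 40 samples ~ 400 ms at 100 Hz
X_tr, X_te, y_tr, y_te = train_test_split(X_lagged, y, shuffle=False)

model = Ridge(alpha=1.0).fit(X_tr, y_tr)
r = np.corrcoef(model.predict(X_te), y_te)[0, 1]  # reconstruction accuracy on held-out data
print(f"held-out reconstruction correlation: {r:.3f}")
```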
Generative audio models typically focus their applications on music and speech generation, with recent models having human-like quality in their output. This paper conducts a systematic literature review of 884 papers in the area of generative audio, in order both to quantify the degree to which researchers in the field are considering potential negative impacts and to identify the types of ethical implications researchers in this area need to consider. Though 65% of the research notes positive impacts of the work, less than 10% discusses any negative impacts. This jarringly small percentage of negative-impact discussion is particularly worrying because the issues brought to light by the few papers doing so raise serious concerns relevant to the broader field, such as the potential for fraud, deep-fakes, and copyright infringement. By quantifying this lack of consideration and identifying key areas of harm, this review lays the groundwork for future work at a critical point in time, to guide more conscientious work as the field progresses.
Deleted Journal, Journal Year: 2025, Volume and Issue: 4(2), Published: May 2, 2025
Abstract
In 2002, George Tzanetakis presented a paper on how researchers could automatically classify musical genre from audio signals. Claiming that his model worked as well as human classifiers, he made his dataset available to anyone who asked for it. Ten years later, a systematic review found that this dataset had circulated on a massive scale — nearly 25% of papers on Music Genre Recognition (MGR) used the so-called GTZAN dataset in their research. Yet an analysis of the dataset revealed significant problems: repetitions, overrepresentations, and files distorted to the point of corruption, with few researchers indicating they had ever listened to the files within it. These warnings went unheeded: GTZAN remains the most widely used MGR dataset today. In this paper, I examine GTZAN from a historical and musicological perspective. I trace the dataset's introduction into the Music Information Retrieval (MIR) community, and show how MIR researchers' tendency to view music as a digital object (a static, query-able set of extracted statistical features), in combination with the ground truths about music that those researchers created, remains embedded in our present-day infrastructures. I argue that tracing the history of the dataset's ascendence to benchmark status can recover the ground-truthing process performed by early researchers. In addition, it provides context for the industry's shift towards descriptive tagging and context-based recommendations.