Computational Linguistics,
Journal Year:
2023,
Volume and Issue:
50(1), P. 237 - 291
Published: Dec. 12, 2023
Abstract
Large language models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social phenomena like persuasiveness and political ideology, then LLMs could augment the computational social science (CSS) pipeline in important ways. This work provides a road map for using LLMs as CSS tools. Towards this end, we contribute a set of prompting best practices and an extensive evaluation pipeline to measure the zero-shot performance of 13 language models on 25 representative English CSS benchmarks. On taxonomic labeling tasks (classification), LLMs fail to outperform the best fine-tuned models but still achieve fair levels of agreement with humans. On free-form coding tasks (generation), LLMs produce explanations that often exceed the quality of crowdworkers’ gold references. We conclude that the performance of today’s LLMs can augment the CSS research pipeline in two ways: (1) serving as zero-shot data annotators on human annotation teams, and (2) bootstrapping challenging creative generation tasks (e.g., explaining the underlying attributes of a text). In summary, LLMs are poised to meaningfully participate in social science analysis in partnership with humans.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
Journal Year:
2022,
Volume and Issue:
unknown
Published: June 1, 2022
Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning methods substantially improve accuracy on a given target distribution, they often reduce robustness to distribution shifts. We address this tension by introducing a simple and effective method for improving robustness while fine-tuning: ensembling the weights of the zero-shot and fine-tuned models (WiSE-FT). Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements under distribution shift, while preserving high accuracy on the target distribution. On ImageNet and five derived distribution shifts, WiSE-FT improves accuracy under distribution shift by 4 to 6 percentage points (pp) over prior work while increasing ImageNet accuracy by 1.6 pp. WiSE-FT achieves similarly large robustness gains (2 to 23 pp) on a diverse set of six further distribution shifts, and accuracy gains of 0.8 to 3.3 pp compared to standard fine-tuning on commonly used transfer learning datasets. These improvements come at no additional computational cost during fine-tuning or inference.
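As a rough illustration of what "ensembling the weights" of two models can look like, the sketch below linearly interpolates two parameter sets that share the same names and shapes. The abstract does not specify the combination rule, so the linear form, the mixing coefficient `alpha`, the function name `interpolate_weights`, and the toy parameter dictionaries are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def interpolate_weights(zero_shot: Weights, fine_tuned: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * zero_shot + alpha * fine_tuned, parameter by parameter.

    alpha = 0 reproduces the zero-shot weights; alpha = 1 reproduces the
    fine-tuned weights. The linear rule is an assumption of this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("both models must share the same parameter names")

    merged: Weights = {}
    for name, zs_values in zero_shot.items():
        ft_values = fine_tuned[name]
        if len(zs_values) != len(ft_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - alpha) * z + alpha * f for z, f in zip(zs_values, ft_values)
        ]
    return merged


# Toy example with made-up values (not taken from any real model).
zero_shot = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
fine_tuned = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}
merged = interpolate_weights(zero_shot, fine_tuned, alpha=0.5)
```

Because the result is a single set of weights, running the merged model costs the same as running either original model, which is consistent with the abstract's remark about no additional computational cost at inference.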
arXiv (Cornell University),
Journal Year:
2021,
Volume and Issue:
unknown
Published: Jan. 1, 2021
This paper explores the environmental impact of the super-linear growth trends for AI from a holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the carbon footprint of AI computing by examining the model development cycle across industry-scale machine learning use cases and, at the same time, considering the life cycle of system hardware. Taking a step further, we capture the operational and manufacturing carbon footprint of AI computing and present an end-to-end analysis for what and how hardware-software design and at-scale optimization can help reduce the overall carbon footprint of AI. Based on the industry experience and lessons learned, we share the key challenges and chart out important development directions across the many dimensions of AI. We hope the key messages and insights presented in this paper can inspire the community to advance the field of AI in an environmentally-responsible manner.
IEEE Journal of Selected Topics in Signal Processing,
Journal Year:
2022,
Volume and Issue:
16(6), P. 1179 - 1210
Published: Sept. 15, 2022
Although supervised deep learning has revolutionized speech and audio processing, it has necessitated the building of specialist models for individual tasks and application scenarios. It is likewise difficult to apply this to dialects and languages for which only limited labeled data is available. Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains. Such methods have shown success in natural language processing and computer vision domains, achieving new levels of performance while reducing the number of labels required for many downstream scenarios. Speech representation learning is experiencing similar progress in three main categories: generative, contrastive, and predictive methods. Other approaches rely on multi-modal data for pre-training, mixing text or visual data streams with speech. Although self-supervised speech representation is still a nascent research area, it is closely related to acoustic word embedding and learning with zero lexical resources, both of which have seen active research for many years. This review presents approaches for self-supervised speech representation learning and their connection to other research areas. Since many current methods focus solely on automatic speech recognition as a downstream task, we review recent efforts on benchmarking learned representations to extend the application beyond speech recognition.