Computational Linguistics, Journal year: 2023, Issue: 50(1), pp. 237-291
Published: Dec. 12, 2023
Abstract
Large language models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social phenomena like persuasiveness and political ideology, then LLMs could augment the computational social science (CSS) pipeline in important ways. This work provides a road map for using LLMs as CSS tools. Towards this end, we contribute a set of prompting best practices and an extensive evaluation pipeline to measure the zero-shot performance of 13 language models on 25 representative English CSS benchmarks. On taxonomic labeling tasks (classification), LLMs fail to outperform the best fine-tuned models but still achieve fair levels of agreement with humans. On free-form coding tasks (generation), LLMs produce explanations that often exceed the quality of crowdworkers' gold references. We conclude that the performance of today's LLMs can augment the CSS research pipeline in two ways: (1) serving as zero-shot data annotators on human annotation teams, and (2) bootstrapping challenging creative generation tasks (e.g., explaining the underlying attributes of a text). In summary, LLMs are poised to meaningfully participate in social science analysis in partnership with humans.
Ophthalmology Science, Journal year: 2023, Issue: 3(4), p. 100324
Published: May 5, 2023
Foundation models are a novel type of artificial intelligence algorithms, in which models are pretrained at scale on unannotated data and fine-tuned for a myriad of downstream tasks, such as generating text. This study assessed the accuracy of ChatGPT, a large language model (LLM), in the ophthalmology question-answering space.
Business & Information Systems Engineering, Journal year: 2023, Issue: 66(1), pp. 111-126
Published: Sep. 12, 2023
The term "generative AI" refers to computational techniques that are capable of generating seemingly new, meaningful content such as text, images, or audio from training data. The widespread diffusion of this technology, with examples such as Dall-E 2, GPT-4, and Copilot, is currently revolutionizing the way we work and communicate with each other. In this article, we provide a conceptualization of generative AI as an entity in socio-technical systems and provide examples of models, systems, and applications. Based on that, we introduce limitations of current generative AI and provide an agenda for Business & Information Systems Engineering (BISE) research. Different from previous works, we focus on generative AI in the context of information systems, and, to this end, we discuss several opportunities and challenges that are unique to the BISE community and make suggestions for impactful directions for BISE research.
arXiv (Cornell University), Journal year: 2021, Issue: unknown
Published: Jan. 1, 2021
Automated visual understanding of our diverse and open world demands computer vision models to generalize well with minimal customization for specific tasks, similar to human vision. Computer vision foundation models, which are trained on diverse, large-scale datasets and can be adapted to a wide range of downstream tasks, are critical for this mission to solve real-world computer vision applications. While existing vision foundation models such as CLIP, ALIGN, and Wu Dao 2.0 focus mainly on mapping images and textual representations to a cross-modal shared representation, we introduce a new computer vision foundation model, Florence, to expand the representations from coarse (scene) to fine (object), from static (images) to dynamic (videos), and from RGB to multiple modalities (caption, depth). By incorporating universal visual-language representations from Web-scale image-text data, our Florence model can be easily adapted for various computer vision tasks, such as classification, retrieval, object detection, VQA, image caption, video retrieval and action recognition. Moreover, Florence demonstrates outstanding performance in many types of transfer learning: fully sampled fine-tuning, linear probing, few-shot transfer and zero-shot transfer for novel images and objects. All of these properties are critical for our vision foundation model to serve general purpose vision tasks. Florence achieves new state-of-the-art results in the majority of 44 representative benchmarks, e.g., ImageNet-1K zero-shot classification with top-1 accuracy of 83.74 and top-5 accuracy of 97.18, 62.4 mAP on COCO fine tuning, 80.36 on VQA, and 87.8 on Kinetics-600.
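The top-1 and top-5 accuracy figures quoted above follow the usual definition: a prediction counts as correct when the true label is among the k highest-scoring classes. A minimal Python sketch of that metric is given here for orientation only; the function name `top_k_accuracy` and the toy scores are illustrative assumptions and are not taken from the Florence paper or its code.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int,
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes.

    `scores[i][c]` is the model score for class `c` on sample `i` (hypothetical data).
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the first k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy example: three samples, three classes (made-up numbers).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
toy_labels = [1, 1, 0]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

With k equal to the number of classes the metric is always 1.0, which is why top-5 figures are always at least as high as top-1 figures on the same benchmark.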
As individuals and communities interact in and with an environment that is increasingly virtual, they are often vulnerable to the commodification of their digital exhaust. Concepts and behavior that are ambiguous in nature are captured in this environment, quantified, and used to categorize, sort, recommend, or make decisions about people's lives. While many organizations seek to utilize this information in a responsible manner, biases remain endemic across technology processes and can lead to harmful impacts regardless of intent. These harmful outcomes, even if inadvertent, create significant challenges for cultivating public trust in artificial intelligence (AI). SP 1270 is a NIST Artificial Intelligence publication and should be read in conjunction with all publications in the NIST AI Series, which was established in January 2023.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal year: 2022, Issue: unknown, pp. 15617-15629
Published: June 1, 2022
State-of-the-art vision and vision-and-language models rely on large-scale visio-linguistic pretraining for obtaining good performance on a variety of downstream tasks. Generally, such models are often either cross-modal (contrastive) or multi-modal (with earlier fusion) but not both; and they often only target specific modalities or tasks. A promising direction would be to use a single holistic universal model, as a "foundation", that targets all modalities at once: a true vision and language foundation model should be good at vision tasks, language tasks, and cross- and multi-modal vision and language tasks. We introduce FLAVA as such a model and demonstrate impressive performance on a wide range of 35 tasks spanning these target modalities.
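The use of artificial intelligence in academia is a hot topic in the education field. The use of chatAPIs and GPT-3 in higher education has the potential to offer a range of benefits, including increased student engagement, collaboration, and accessibility. However, these tools also raise a number of challenges and concerns, particularly in relation to academic honesty and plagiarism. This paper examines the opportunities and challenges of using chatAPIs and GPT-3 in higher education, with a focus on the potential risks and rewards of these tools and the ways in which universities can address the challenges they pose. The paper discusses the main features and capabilities of chatAPIs and GPT-3 and provides examples of their use in higher education. It also considers the potential for chatAPIs and GPT-3 to be used for academic dishonesty and the difficulties of detecting and preventing such abuses. Finally, the paper suggests a range of strategies that universities can adopt to ensure that chatAPIs and GPT-3 are used ethically and responsibly, including developing policies and procedures, providing training and support, and using a variety of methods to detect and prevent cheating.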
IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal year: 2024, Issue: 46(8), pp. 5227-5244
Published: April 3, 2024
The foundation model has recently garnered significant attention due to its potential to revolutionize the field of visual representation learning in a self-supervised manner. While most foundation models are tailored to effectively process RGB images for various visual tasks, there is a noticeable gap in research focused on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we created for the first time a universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT). Compared to existing foundation models, SpectralGPT 1) accommodates input images with varying sizes, resolutions, time series, and regions in a progressive training fashion, enabling full utilization of extensive RS Big Data; 2) leverages 3D token generation for spatial-spectral coupling; 3) captures spectrally sequential patterns via multi-target reconstruction; and 4) trains on one million spectral RS images, yielding models with over 600 million parameters. Our evaluation highlights significant performance improvements with pretrained SpectralGPT models, signifying substantial potential in advancing spectral RS Big Data applications within the field of geoscience across four downstream tasks: single/multi-label scene classification, semantic segmentation, and change detection.
CHI Conference on Human Factors in Computing Systems, Journal year: 2022, Issue: unknown, pp. 1-22
Published: April 28, 2022
Although large language models (LLMs) have demonstrated impressive potential on simple tasks, their breadth of scope, lack of transparency, and insufficient controllability can make them less effective when assisting humans on more complex tasks. In response, we introduce the concept of Chaining LLM steps together, where the output of one step becomes the input for the next, thus aggregating the gains per step. We first define a set of LLM primitive operations useful for Chain construction, then present an interactive system where users can modify these Chains, along with their intermediate results, in a modular way. In a 20-person user study, we found that Chaining not only improved the quality of task outcomes, but also significantly enhanced system transparency, controllability, and sense of collaboration. Additionally, we saw that users developed new ways of interacting with LLMs through Chains: they leveraged sub-tasks to calibrate model expectations, compared and contrasted alternative strategies by observing parallel downstream effects, and debugged unexpected model outputs by "unit-testing" sub-components of a Chain. In two case studies, we further explore how LLM Chains may be used in future applications.
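The chaining idea described above, where each step's output becomes the next step's input and intermediate results stay inspectable, can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: `run_chain`, `split_points`, and `count_points` are hypothetical names, and plain string functions stand in for LLM calls; none of this is the authors' system or API.

```python
from typing import Callable, List

# Assumption: a "step" is any function from input text to output text.
# In a real Chain each step would wrap an LLM call; here they are toy functions.
Step = Callable[[str], str]


def run_chain(steps: List[Step], initial_input: str) -> List[str]:
    """Run steps in order, feeding each output into the next step.

    Returns every intermediate result so that individual steps can be
    inspected or "unit-tested" in isolation.
    """
    results: List[str] = []
    current = initial_input
    for step in steps:
        current = step(current)
        results.append(current)
    return results


def split_points(text: str) -> str:
    """Toy step 1: split text into one point per line."""
    return "\n".join(part.strip() for part in text.split(".") if part.strip())


def count_points(text: str) -> str:
    """Toy step 2: summarize how many points the previous step produced."""
    return f"{len(text.splitlines())} points"


intermediate = run_chain([split_points, count_points], "A. B. C.")
final_output = intermediate[-1]
```

Keeping the intermediate outputs in a list is a deliberate choice in this sketch: it mirrors the abstract's point that users benefit from seeing and modifying the results between steps rather than only the final output.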
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal year: 2022, Issue: unknown, pp. 7076-7086
Published: June 1, 2022
Image segmentation is usually addressed by training a model for a fixed set of object classes. Incorporating additional classes or more complex queries later is expensive as it requires re-training the model on a dataset that encompasses these expressions. Here we propose a system that can generate image segmentations based on arbitrary prompts at test time. A prompt can be either a text or an image. This approach enables us to create a unified model (trained once) for three common segmentation tasks, which come with distinct challenges: referring expression segmentation, zero-shot segmentation and one-shot segmentation. We build upon the CLIP model as a backbone which we extend with a transformer-based decoder that enables dense prediction. After training on an extended version of the PhraseCut dataset, our system generates a binary segmentation map for an image based on a free-text prompt or on an additional image expressing the query. We analyze different variants of the latter image-based prompts in detail. This novel hybrid input allows for dynamic adaptation not only to the three segmentation tasks mentioned above, but to any binary segmentation task where a text or image query can be formulated. Finally, we find our system to adapt well to generalized queries involving affordances or properties. Code is available at https://eckerlab.org/code/CLIPSeg