bioRxiv (Cold Spring Harbor Laboratory),
Journal Year: 2022, Volume and Issue: unknown
Published: Nov. 22, 2022
Abstract
Perception in the mature human visual system relies heavily on prior knowledge. Here we show for the first time that prior-knowledge-induced reshaping of perception emerges gradually, and late in childhood. To isolate the effects of prior knowledge on vision, we presented 4-to-12-year-olds and adults with two-tone images, which are degraded photos that are hard to recognise on first viewing. In adults, seeing the original photo causes a perceptual reorganisation leading to sudden, mandatory recognition of the two-tone version - a well-documented process relying on top-down signalling from higher-order brain areas to early visual cortex. We find that children younger than 7 to 9 years, however, do not experience this knowledge-guided shift, despite viewing the original photo immediately before each two-tone. To assess the potential computations underlying this development, we compared human performance to three state-of-the-art neural networks with varying architectures. We found that the best-performing architecture behaved much like 4- to 5-year-old humans, who display a feature-based rather than holistic processing strategy akin to neural networks. Our results reveal a striking age-related shift in the reconciliation of prior knowledge with sensory input, which may underpin the development of many perceptual abilities.
Behavioral and Brain Sciences,
Journal Year: 2022, Volume and Issue: 46
Published: Dec. 1, 2022
Abstract
Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain datasets (e.g., single cell responses or fMRI data). However, these behavioral and brain datasets do not test hypotheses regarding what features are contributing to good predictions, and we show that the predictions may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on making the best predictions. We conclude by briefly summarizing various promising modeling approaches that focus on psychological data.
Annual Review of Vision Science,
Journal Year: 2023, Volume and Issue: 9(1), P. 501 - 524
Published: March 31, 2023
Deep neural networks (DNNs) are machine learning algorithms that have revolutionized computer vision due to their remarkable successes in tasks like object classification and segmentation. The success of DNNs as computer vision algorithms has led to the suggestion that DNNs may also be good models of human visual perception. In this article, we review evidence regarding current DNNs as adequate behavioral models of human core object recognition. To this end, we argue that it is important to distinguish between statistical tools and computational models and to understand model quality as a multidimensional concept in which clarity about modeling goals is key. Reviewing a large number of psychophysical and computational explorations of core object recognition performance in humans and DNNs, we argue that DNNs are highly valuable scientific tools but that, as of today, DNNs should only be regarded as promising, but not yet adequate, computational models of human core object recognition behavior. On the way, we dispel several myths surrounding DNNs in vision science.
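Deep neural networks (DNNs) have had extraordinary successes in classifying photographic images of objects and are often described as the best models of biological vision. This conclusion is largely based on three sets of findings: (1) DNNs are more accurate than any other model in classifying images taken from various datasets, (2) DNNs do the best job in predicting the pattern of human errors in classifying objects taken from various behavioral benchmark datasets, and (3) DNNs do the best job in predicting brain signals in response to images taken from various brain benchmark datasets (e.g., single cell responses or fMRI data). However, most behavioral and brain benchmarks report the outcomes of observational experiments that do not manipulate any independent variables, and we show that the good prediction on these datasets may be mediated by DNNs that share little overlap with biological vision. More problematically, we show that DNNs account for almost no results from psychological research. This contradicts the common claim that DNNs are good, let alone the best, models of human object recognition. We argue that theorists interested in developing biologically plausible models of human vision need to direct their attention to explaining psychological findings. More generally, theorists need to build models that explain the results of experiments that manipulate independent variables designed to test hypotheses rather than compete on predicting observational data. We conclude by briefly summarizing various promising modelling approaches that focus on psychological data.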
Measurement Sensors,
Journal Year: 2024, Volume and Issue: 31, P. 101025 - 101025
Published: Jan. 8, 2024
In this paper, we propose a novel method to improve the localisation precision of identified objects. We present a framework for iteratively enhancing image region recommendations to meet the ground truth values. The Faster R–CNN (FR-CNN) is an object recognition deep convolutional network. It gives the user the impression that the network is cohesive and single. It can provide accurate and timely predictions about the whereabouts of a range of objects. We first build a unified model based on the rapid FR-CNN to relocate inaccurate area recommendations. Because of the emphasis on detection, it may be utilized with a wide range of datasets and is compatible with various FR-CNN architectures. Second, we focus on the application of a joint score function to a variety of picture features. This function depicts the location of the concealed object concerning the other data. The updated data and the structured production loss are the only two inputs that influence the parameters of the scoring function. The join-score function and iterative context refinement (CIR) are used to generate our final model, which is then classified using the Smooth Support Vector Machine (SSVM). We measured accuracy and mean average precision after training FR-CNN + CIR + SSVM on a low-cost GPU using the PASCAL VOC 2012 dataset. Our results are 3.6 % more exact than rival deep learning algorithms on average.
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year: 2024, Volume and Issue: unknown
Published: March 3, 2024
ABSTRACT
Eye-tracking is an essential tool in many fields, yet existing solutions are often limited for customized applications due to cost or lack of flexibility. We present OpenIris, an adaptable and user-friendly open-source framework for video-based eye-tracking. OpenIris is developed in C# with a modular design that allows further extension and customization through plugins for different hardware systems, tracking, and calibration pipelines. It can be remotely controlled via a network interface from other devices or programs. Eye movements can be recorded online from a camera stream or offline by post-processing recorded videos. Example plugins have been developed to track eye motion in 3-D, including torsion. Currently implemented binocular pupil tracking pipelines can achieve frame rates of more than 500 Hz. With the OpenIris framework, we aim to fill a gap in the research tools available for high-precision and high-speed eye-tracking, especially in environments that require custom solutions that are not currently well-served by commercial eye-trackers.
CCS CONCEPTS: Applied computing → Life and medical sciences.
PLoS Computational Biology,
Journal Year: 2025, Volume and Issue: 21(1), P. e1012751 - e1012751
Published: Jan. 27, 2025
The human visual system possesses a remarkable ability to detect and process faces across diverse contexts, including the phenomenon of face pareidolia: seeing faces in inanimate objects. Despite extensive research, it remains unclear why the visual system employs such broadly tuned face detection capabilities. We hypothesized that face pareidolia results from the visual system’s optimization for recognizing both faces and objects. To test this hypothesis, we used task-optimized deep convolutional neural networks (CNNs) and evaluated their alignment with human behavioral signatures and neural responses, measured via magnetoencephalography (MEG), related to pareidolia processing. Specifically, we trained CNNs on tasks involving combinations of face identification, face detection, object categorization, and object detection. Using representational similarity analysis, we found that CNNs that included object categorization in their training tasks represented pareidolia faces, real faces, and matched objects more similarly to neural responses than those that did not. Although these CNNs showed similar overall alignment with neural data, a closer examination of their internal representations revealed that specific training tasks had distinct effects on how pareidolia faces were represented across layers. Finally, interpretability methods revealed that only a CNN trained for both face identification and object categorization relied on face-like features, such as ‘eyes’, to classify pareidolia stimuli as faces, mirroring findings in human perception. Our results suggest that human-like face pareidolia may emerge from the visual system’s optimization for face identification within the context of generalized object categorization.
Object recognition is an important human ability that relies on distinguishing between similar objects, for example, deciding which kitchen utensil(s) to use at different stages of meal preparation. Recent work describes the fine-grained organization of knowledge about manipulable objects via the study of the constituent dimensions that are most relevant to human behavior, for example, vision, manipulation, and function-based object properties. A logical extension of this work concerns whether or not these dimensions are uniquely human, or can be approximated by deep learning. Here, we show that behavioral dimensions are well-predicted by a state-of-the-art multimodal network trained on a large and diverse set of image-text pairs, CLIP-ViT, which in part can also generate good predictions of behavior for previously unseen objects. Moreover, this model vastly outperforms comparison networks pre-trained with smaller, image-only training datasets. These results demonstrate the impressive capacity of CLIP-ViT to approximate fine-grained object knowledge. We discuss possible sources of this benefit relative to other tested models (e.g. multimodal pre-training vs. image only pre-training, dataset size, architecture).
Journal of Neuroscience,
Journal Year: 2024, Volume and Issue: 44(42), P. e0349242024 - e0349242024
Published: Aug. 28, 2024
The visual world is richly adorned with texture, which can serve to delineate important elements of natural scenes. In anesthetized macaque monkeys, selectivity for the statistical features of natural texture is weak in V1, but substantial in V2, suggesting that neuronal activity in V2 might directly support texture perception. To test this, we investigated the relation between single cell activity in macaque V1 and V2 and simultaneously measured behavioral judgments of texture. We generated stimuli along a continuum between naturalistic texture and phase-randomized noise and trained two macaque monkeys to judge whether a sample texture more closely resembled one or the other extreme. Analysis of responses revealed that individual V1 and V2 neurons carried much less information about texture naturalness than behavioral reports. However, the sensitivity of V2 neurons, especially those preferring naturalistic textures, was significantly closer to that of behavior compared with V1. The firing of both V1 and V2 neurons predicted perceptual choices in response to repeated presentations of the same ambiguous stimulus in one monkey, despite low individual neural sensitivity. However, neither population predicted choice in the second monkey. We conclude that neural responses supporting texture perception likely continue to develop downstream of V2. Further, combined with neural data recorded while the same two monkeys performed an orientation discrimination task, our results demonstrate that choice-correlated neural activity in early sensory cortex is unstable across observers and tasks, untethered from neuronal sensitivity, and therefore unlikely to directly reflect the formation of perceptual decisions.