2022 26th International Conference on Pattern Recognition (ICPR),
Year: 2022, No. 6791, pp. 4248 - 4255
Published: Aug. 21, 2022
Deep learning models thrive with high amounts of data where the classes are, usually, appropriately balanced. In medical imaging, however, we often encounter the opposite case. Wireless Capsule Endoscopy is not an exception; even if huge amounts of data could be obtained, labeling each frame of a video could take up to twelve hours for an expert physician. Those videos would show no pathologies for most patients, while a minority would have a few frames associated with a pathology. Overall, there is little labeled data and a great class imbalance. Self-supervised learning provides a means to use unlabelled data to initialize models that can perform better under the described circumstance. We propose a novel contrastive loss derived from Triplet Loss, crafted to leverage the temporal information in endoscopy videos. We show that our model outperforms other existing methods in several tasks.
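The abstract does not give the form of the proposed loss, so the following is only a minimal sketch of the standard Triplet Loss combined with one plausible way of using temporal information: frames close in time to the anchor are treated as positives and frames far away as negatives. The function names and the `pos_window`, `neg_gap`, and `margin` parameters are hypothetical and are not taken from the paper.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def sample_temporal_triplet(num_frames, anchor_idx, pos_window=2, neg_gap=10, rng=None):
    """Pick (anchor, positive, negative) frame indices from one video.

    Assumption (not from the paper): a positive lies within `pos_window`
    frames of the anchor, a negative lies at least `neg_gap` frames away.
    """
    rng = rng or random.Random()
    positives = [i for i in range(num_frames)
                 if i != anchor_idx and abs(i - anchor_idx) <= pos_window]
    negatives = [i for i in range(num_frames)
                 if abs(i - anchor_idx) >= neg_gap]
    if not positives or not negatives:
        raise ValueError("video too short for the chosen window and gap")
    return anchor_idx, rng.choice(positives), rng.choice(negatives)
```

In a training loop, the sampled indices would select frames whose embeddings are passed to `triplet_loss`; no labels are needed because the supervision comes from frame order alone.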
IEEE Access,
Year: 2022, No. 10, pp. 122762 - 122785
Published: Jan. 1, 2022
The fast progress in digital technology has sparked the generation of a large amount of voluminous data from different social media platforms like Instagram, Facebook, YouTube, etc. There are other platforms as well which generate large amounts of data, such as News, CCTV videos, Sports, Entertainment, etc. Lengthy videos typically contain a significant number of duplicate occurrences that are uninteresting to the viewer. Eliminating this unnecessary information and concentrating only on crucial events will be far more advantageous. This produces a summary of lengthy films, which can save viewers time and enable better memory management. The highlights of the lengthy video are condensed into a summary. Video summarization is an essential topic today since many industries have CCTV cameras installed for various reasons such as monitoring, security, and tracking. Because surveillance videos are taken 24 hours a day, enormous amounts of time and memory are required if one wishes to trace any incident or person from a full day's video. A summary generated from multiple-view videos is challenging, so more study and advancement in multi-view summarization (MVS) is required. The conceptual basis of video summarizing approaches is thoroughly addressed in this paper. The paper also addresses the applications and challenges of Single View and Multi View video summarization.
IEEE Access,
Year: 2023, No. 11, pp. 10850 - 10863
Published: Jan. 1, 2023
Wireless capsule endoscopy (WCE) is a recently developed tool that allows for the painless and non-invasive examination of the entire gastrointestinal (GI) tract. The microcamera captures a large number of redundant frames in each WCE examination, such that a video summarization technique is needed to assist in diagnosis. However, prevalent methods of summarizing WCE videos focus only on the representativeness of frames, owing to the lack of high-level information on their importance. This paper develops a Frame Importance-Assisted Sparse Subset Selection model, called FIAS3, to integrate frame importance from networks into the sparse subset selection model. FIAS3 is optimized under three constraints: 1) a frame importance matrix to help pay more attention to important frames, 2) a sparsity constraint to make summaries compact, and 3) a similarity-inhibiting constraint to reduce redundancy. The results of experiments on a public dataset demonstrated that our model outperforms other methods of summarizing WCE videos. Specifically, its coverage and reconstruction error were 92% and 0.143, respectively, at a 90% compression ratio, recording respective improvements of at least 16.9% and 0.031 over other methods. The results of generalization experiments showed that FIAS3 also achieves competitive results on private datasets.
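The interplay of the three constraints can be illustrated with a much simpler stand-in. The sketch below is not the FIAS3 optimization; it is a greedy, maximal-marginal-relevance style heuristic in which a frame budget plays the role of the sparsity constraint, a per-frame importance score plays the role of the importance matrix, and a cosine-similarity penalty plays the role of similarity inhibition. All names (`select_frames`, `budget`, `inhibit`) are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def select_frames(features, importance, budget, inhibit=0.5):
    """Greedily pick up to `budget` frame indices.

    Each step picks the frame with the highest
        importance[i] - inhibit * (max similarity to already selected frames),
    so important frames are favored and near-duplicates are penalized.
    """
    selected = []
    candidates = set(range(len(features)))
    while candidates and len(selected) < budget:
        def score(i):
            redundancy = max(
                (cosine(features[i], features[j]) for j in selected),
                default=0.0,
            )
            return importance[i] - inhibit * redundancy

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With three frames where the first two are near-duplicates, the heuristic keeps the most important frame and then prefers the dissimilar third frame over the redundant second one, even though the second has a higher importance score.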
2021 2nd International Conference on Range Technology (ICORT),
Year: 2021, No. unknown, pp. 1 - 6
Published: Aug. 5, 2021
In the last decade, the Video Summarization (VS) approach has played a pivotal role in the analysis of video contents. The methodologies involved have a wide range of applications in the field of defense for video surveillance, intrusion, object detection, Video Browsing, Content-based Video Retrieval and Storage, etc. In this study, we proposed video summarization techniques to extract frames of interest. The video summarization is determined by advanced texture descriptors. Local Ternary Pattern (LTP) and Local Phase Quantization (LPQ) are the texture descriptor methods used to provide an efficient video summarization process. These methods are in conformity with the elimination of redundant frames as well as the maintenance of a user-defined number of distinctive images. We then apply a clustering process, in which unsupervised machine learning algorithms, such as Affinity Propagation and BIRCH, are utilized to cluster similar frames into one group. The results confirm that the summary represents most of the input video and, of the same importance, preserves the continuity of the summarized video.
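As an illustration of the first descriptor, the sketch below computes the Local Ternary Pattern of a single 3x3 grayscale patch: each neighbor is coded +1, 0, or -1 depending on whether it lies above, within, or below a tolerance band around the center pixel, and the ternary code is then split into an "upper" and a "lower" binary pattern. The threshold `t`, the clockwise neighbor order, and the bit weights are conventions assumed here, not details taken from the paper.

```python
def ltp_codes(patch, t=5):
    """Local Ternary Pattern of a 3x3 patch given as a list of 3 rows.

    Returns (ternary, upper, lower): the 8 ternary codes and the two
    8-bit integers obtained by keeping only the +1 or only the -1 entries.
    """
    center = patch[1][1]
    # Neighbors visited clockwise, starting at the top-left pixel (assumed order).
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    ternary = []
    for row, col in coords:
        value = patch[row][col]
        if value >= center + t:
            ternary.append(1)
        elif value <= center - t:
            ternary.append(-1)
        else:
            ternary.append(0)
    # Bit i is set when neighbor i is coded +1 (upper) or -1 (lower).
    upper = sum(1 << i for i, code in enumerate(ternary) if code == 1)
    lower = sum(1 << i for i, code in enumerate(ternary) if code == -1)
    return ternary, upper, lower
```

Histograms of the upper and lower patterns over a whole frame would give a texture feature vector; frames with similar vectors could then be grouped by a clustering algorithm such as the Affinity Propagation or BIRCH methods named in the abstract.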
Abstract
The exponential increase in video consumption has created new difficulties for browsing and navigating through video more effectively and efficiently. Researchers are interested in video summarization because it offers a brief but instructive version of the video that helps users and systems save time and effort when looking for and comprehending relevant content. Key frame extraction is a method of video summarization that chooses only the most important frames from a given video. In this article, a novel supervised learning method, ‘TC‐CLSTM Auto Encoder with Mode‐based Learning’, using temporal and spatial features is proposed for automatically choosing keyframes or sub‐shots from videos. The method was able to achieve an average F‐score of 84.35 on the TVSum dataset. Extensive tests on benchmark data sets show that the suggested methodology outperforms state‐of‐the‐art methods.
Gastrointestinal diseases pose a global health challenge, necessitating prompt detection and precise categorization for effective treatment. For the first time, this study investigated 24 distinct gastrointestinal (GI) problems across two testing trials, the second involving 13 different GI tract diseases. This research introduces a novel Lightweight Parallel Depth-Wise Separable Convolutional Neural Network (LPDS-CNN), along with a Ridge Regression Extreme Learning Machine (RRELM) classifier, for accurate identification of GI images from an endoscopy dataset. A hybrid pre-processing technique was developed to enhance image quality and minimize noise, combining artefact removal, contrast-limited adaptive histogram equalization (CLAHE), sharpening, and Gaussian filtering. The LPDS-CNN effectively captures discriminative features while retaining a mere 0.498 million parameters and nine layers, significantly reducing complexity during computations. The proposed framework delivers remarkable performance on various metrics. In the first trial (24 classes), the average precision, recall, f1, accuracy, and ROC-AUC scores stand at 83.42±0.27%, 68.08±0.311%, 72.63±0.275%, 89.13%, and 98.11%, respectively. In the second trial (13 classes), the scores are even higher, at 91.08±0.062%, 88.15±0.092%, 89.54±0.066%, 92.15%, and 98.26%. The framework is exceptionally efficient, with average training and testing times of 0.0192 and 0.002 seconds, respectively. Comparative analysis with state-of-the-art (SOTA) transfer learning (TL) methods validates the model's real-time analytical prowess. Additionally, the integration of SHAP (Shapley Additive Explanations) enhances interpretability, offering valuable insights for confident real-world diagnosis. This comprehensive approach shows potential to improve GI diagnosis and enable earlier treatment worldwide.
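The small parameter count follows largely from the depth-wise separable design. As a worked illustration of the general technique (not of the LPDS-CNN layer configuration, which the abstract does not specify), the sketch below compares the number of weights in a standard convolution with that of a depth-wise convolution followed by a 1x1 point-wise convolution. The function names are hypothetical.

```python
def standard_conv_params(k, c_in, c_out, bias=False):
    """Weights in a standard k x k convolution from c_in to c_out channels."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(k, c_in, c_out, bias=False):
    """Weights in a depth-wise separable convolution.

    Depth-wise step: one k x k filter per input channel.
    Point-wise step: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise
```

For a 3x3 layer mapping 32 channels to 64 channels without bias, the standard convolution has 18,432 weights and the separable version has 2,336, roughly an eightfold reduction.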
Bulletin of Electrical Engineering and Informatics,
Year: 2023, No. 13(1), pp. 312 - 319
Published: Dec. 13, 2023
Wireless capsule endoscopy is one of the diagnostic methods used to record video of the gastrointestinal tract. The capsule stays in the digestive system for at least eight hours. It is difficult for gastroenterologists to examine such a lengthy video and identify the ailment. Convolutional neural networks (CNN) are a powerful solution to several computer vision problems. A CNN can speed up the reviewing time of the recorded video by classifying frames into various categories. The primary emphasis of this research paper is to evaluate the performance of three different CNN architectures (VGG, Inception, and MobileNet) in classifying disease. Experimental results demonstrate that MobileNetV2’s accuracy is 91%, whereas InceptionV3 and VGG16 have an accuracy of 94%, which is better than MobileNetV2. However, MobileNetV2 performed relatively better than the other models in terms of computational cost. The models’ F-score, precision, and recall values are also computed and compared.
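For reference, the per-class metrics mentioned above can be computed from frame-level predictions as in the following minimal sketch. The labels in the usage note are made-up illustrative values, not data from the paper, and the function name is hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For example, with true labels `["ulcer", "normal", "ulcer", "polyp", "ulcer"]` and predictions `["ulcer", "ulcer", "normal", "polyp", "ulcer"]`, the class `"ulcer"` has two true positives, one false positive, and one false negative, so precision, recall, and F1 all equal 2/3.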
With the rise of conversation-oriented streaming videos, platforms that host them, like Twitch, have rapidly become prominent information hubs. However, the lengthy nature of such streams often deters viewers from consuming the full content. To mitigate this, we propose AMP-BiLSTM, a novel highlight extraction method which focuses on the textual information in streamer discourses and viewer responses rather than visual features. This approach addresses the limitations of previous methods, which primarily centered on analyzing visual features and were thus insufficient for conversation-oriented streams, where highlights emerge from dialogues and viewer interactions. AMP-BiLSTM is built on the techniques of Attention, Multi-channel, and Position enrichment integrated into a Bidirectional Long Short-Term Memory (BiLSTM) network. Through experiments on a real-world dataset, we found that streamer discourses and viewer messages provide significant utility for highlight extraction in conversation-oriented streaming videos. Furthermore, our proposed Multi-channel and Position enrichment with self-attention effectively distill text into semantically-rich embeddings. The experiment results demonstrate that AMP-BiLSTM outperforms several state-of-the-art methods of deep learning-based highlight extraction, showing promise for improved video content digestion.
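The abstract does not detail how attention is wired into AMP-BiLSTM. As a generic illustration of the attention component only, the sketch below pools a sequence of hidden states (for example, one per chat message) into a single vector using dot-product attention against a query vector. The function names and the choice of dot-product scoring are assumptions, not the paper's design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each hidden state by its dot product with `query`, then sum.

    Returns (weights, pooled): the attention weights (summing to 1) and
    the attention-weighted average of the hidden states.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled
```

In a real model the hidden states would come from the BiLSTM and the query would be learned; here both are plain lists so the weighting step can be inspected in isolation.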