Hardware Acceleration for Object Detection using YOLOv5 Deep Learning Algorithm on Xilinx Zynq FPGA Platform
Taoufik Saidani, Refka Ghodhbani, Ahmed Alhomoud, et al.
Engineering Technology & Applied Science Research, Journal Year: 2024, Volume and Issue: 14(1), P. 13066 - 13071
Published: Feb. 8, 2024
Object recognition presents considerable difficulties within the domain of computer vision. Field-Programmable Gate Arrays (FPGAs) offer a flexible hardware platform, having exceptional computing capabilities due to their adaptable topologies, enabling highly parallel, high-performance, and diverse operations that allow for customized reconfiguration of integrated circuits to enhance the effectiveness of object detection accelerators. However, there is a scarcity of assessments that offer a comprehensive analysis of FPGA-based object detection accelerators, and there is currently no comprehensive framework to enable object detection specifically tailored to the unique characteristics of FPGA technology. The You Only Look Once (YOLO) algorithm is an innovative method that combines speed and accuracy in object detection. This study implemented the YOLOv5 algorithm on a Xilinx® Zynq-7000 System on Chip (SoC) to perform real-time object detection. Using the MS-COCO dataset, the proposed study showed an improvement in resource utilization with approximately 42 thousand (78%) look-up tables, 56 (52%) flip-flops, 65 (46%) BRAMs, and 19 (9%) DSPs at a frequency of 250 MHz, improving the effectiveness compared to previous simulated results.
Language: English
Fall Detection of the Elderly Using Denoising LSTM-Based Convolutional Variant Autoencoder
IEEE Sensors Journal, Journal Year: 2024, Volume and Issue: 24(11), P. 18556 - 18567
Published: April 25, 2024
As societies age, the issue of falls has become increasingly critical for the health and safety of the elderly. Fall detection in the elderly has traditionally relied on supervised learning methods, which require data on falls, which is difficult to obtain in real situations. Additionally, the complexity of integrating deep learning models into wearable devices for real-time fall detection has been challenging due to limited computational resources. In this paper, we propose a novel fall detection method using unsupervised learning based on a denoising long short term memory (LSTM)-based convolutional variational autoencoder (CVAE) model to solve the problem of the lack of fall data. By utilizing the proposed data debugging and hierarchical data balancing techniques, the proposed method achieves an F1 score of 1.0 while reducing the parameter count by 25.6 times compared to the state-of-the-art unsupervised deep learning method. The resulting model occupies only 157.65 KB of memory, making it highly suitable for integration into wearable devices.
Language: English
An Optimised CNN Hardware Accelerator Applicable to IoT End Nodes for Disruptive Healthcare
Arfan Ghani, Akinyemi Aina, Chan Hwang See, et al.
IoT, Journal Year: 2024, Volume and Issue: 5(4), P. 901 - 921
Published: Dec. 6, 2024
In the evolving landscape of computer vision, the integration of machine learning algorithms with cutting-edge hardware platforms is increasingly pivotal, especially in the context of disruptive healthcare systems. This study introduces an optimized implementation of a Convolutional Neural Network (CNN) on the Basys3 FPGA, designed specifically for accelerating the classification of cytotoxicity in human kidney cells. Addressing the challenges posed by constrained dataset sizes, compute-intensive AI algorithms, and hardware limitations, the approach presented in this paper leverages efficient image augmentation and pre-processing techniques to enhance both prediction accuracy and training efficiency. The CNN, quantized to 8-bit precision and tailored to the FPGA’s resource constraints, significantly accelerates training by a factor of three while consuming only 1.33% of the power compared to a traditional software-based CNN running on an NVIDIA K80 GPU. The network architecture, composed of seven layers with excessive hyperparameters, processes downscale grayscale images, achieving notable gains in speed and energy efficiency. A cornerstone of our methodology is the emphasis on parallel processing, data type optimization, and reduced logic space usage through integer operations. We conducted extensive image pre-processing, including histogram equalization and artefact removal, to maximize feature extraction from the augmented dataset. Achieving an accuracy of approximately 91% on unseen images, the FPGA-implemented CNN demonstrates the potential for rapid, low-power medical diagnostics within a broader IoT ecosystem where data could be assessed online. This work underscores the feasibility of deploying resource-efficient AI models in environments where traditional high-performance computing resources are unavailable, typically in healthcare settings, paving the way for and contributing to advanced computer vision in embedded systems.
Language: English
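The abstract above mentions quantizing the CNN to 8-bit precision and relying on integer operations. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes symmetric per-tensor quantization, and the helper names `quantize_int8` and `int_dot` are hypothetical.

```python
# Illustrative sketch only: symmetric per-tensor 8-bit quantization.
# The scheme and all names here are assumptions, not taken from the paper.

def quantize_int8(values):
    """Map floats to signed 8-bit integers in [-127, 127] plus a scale factor."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int_dot(qa, qb):
    """Integer-only multiply-accumulate, the operation an FPGA can map to DSP slices."""
    return sum(a * b for a, b in zip(qa, qb))


weights = [0.5, -1.0, 0.25]
inputs = [1.0, 2.0, -4.0]

q_weights, w_scale = quantize_int8(weights)
q_inputs, x_scale = quantize_int8(inputs)

# Accumulate in integers, then rescale once at the end.
approx = int_dot(q_weights, q_inputs) * w_scale * x_scale
exact = sum(w * x for w, x in zip(weights, inputs))

print(f"exact={exact:.4f} approx={approx:.4f}")
```

The point of the sketch is that only the final rescaling needs non-integer arithmetic; the multiply-accumulate loop itself works on small integers, which is what keeps logic usage low on a resource-constrained device.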
Review of Energy-Efficient Embedded System Acceleration of Convolution Neural Networks for Organic Weeding Robots
Agriculture, Journal Year: 2023, Volume and Issue: 13(11), P. 2103 - 2103
Published: Nov. 6, 2023
The sustainable cultivation of organic vegetables and the associated problem of weed control has been a current research topic for some time. Despite this, the use of chemical and synthetic pesticides increases every year. This problem is to be solved with the help of an automated robot system. The current version of the weeding robot uses GPUs to execute the inference phase. This requires a lot of energy for an 8-track robot. To enable autonomous solar operation, the system must be made more energy efficient. This work aims to evaluate possible approaches and the current state of research on implementing convolution neural networks on low power embedded systems. In the course of the work, the technical feasibility of the implementation of CNNs in FPGAs was examined, in particular following an example analysis. The paper shows that the acceleration of CNNs using FPGAs is technically feasible as detection hardware for the weeding robot. With the help of existing literature, the optimization possibilities of hardware and software have been evaluated. The trials of different hardware accelerators with diverse hardware and software were investigated and compared.
Language: English
Efficient Real time Zynq 7000 FPGA deployment of optimized YOLOv2 deep learning model for target detection, based on HDL Coder Methodology
International Journal on Information Technologies and Security, Journal Year: 2024, Volume and Issue: 16(2), P. 15 - 26
Published: June 1, 2024
Field Programmable Gate Arrays (FPGAs) have garnered significant attention in the development and enhancement of target identification algorithms that employ YOLOv2 models and FPGAs, owing to their adaptability and user-friendliness. The Simulink HDL compiler was utilized to design, simulate, and implement our proposed design. In an effort to rectify this, this paper presents a comprehensive programming and design proposal. The implementation of the YOLOv2 algorithm for real-time vehicle detection on the Xilinx® Zynq-7000 System-on-a-chip is proposed in this work. Real-time testing of the synthesised hardware revealed that it can process Full HD video at a rate of 16 frames per second. On the Xilinx Zynq-7000 SoC, the estimated dynamic power consumption is less than 90 mW. When comparing the results of the proposed work with those of other simulations, it is observed that resource utilization is enhanced by around 204 k (75%) LUT, 305 (12%) DSP, and 224 (41%) flip-flops at 200 MHz.
Language: English
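To put the reported throughput in perspective, the short calculation below converts "Full HD video at 16 frames per second" into a pixel rate and a per-pixel clock budget. It is a back-of-the-envelope sketch under two assumptions not stated in the abstract: that Full HD means 1920 x 1080 pixels, and that the 200 MHz figure is the clock of the processing pipeline.

```python
# Back-of-the-envelope throughput estimate (assumptions labeled below).
FULL_HD_WIDTH = 1920          # assumption: Full HD taken as 1920 x 1080
FULL_HD_HEIGHT = 1080
FRAMES_PER_SECOND = 16        # from the abstract
CLOCK_HZ = 200_000_000        # assumption: the reported 200 MHz drives the pipeline

pixels_per_frame = FULL_HD_WIDTH * FULL_HD_HEIGHT
pixels_per_second = pixels_per_frame * FRAMES_PER_SECOND

# Time available to process one frame, and clock cycles available per pixel.
frame_budget_ms = 1000 / FRAMES_PER_SECOND
cycles_per_pixel = CLOCK_HZ / pixels_per_second

print(f"pixels per frame:  {pixels_per_frame}")
print(f"pixels per second: {pixels_per_second}")
print(f"frame budget:      {frame_budget_ms} ms")
print(f"cycles per pixel:  {cycles_per_pixel:.2f}")
```

Under these assumptions the design has roughly six clock cycles per input pixel, which illustrates why parallel hardware is needed to run a detector at this frame rate.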
Hardware Implementation of a Deep Learning-based Autonomous System for Smart Homes using Field Programmable Gate Array Technology
Mohamed Tounsi, Ali Jafer Mahdi, Mahmood Anees Ahmed, et al.
Engineering Technology & Applied Science Research, Journal Year: 2024, Volume and Issue: 14(5), P. 17203 - 17208
Published: Oct. 9, 2024
The current study uses Field-Programmable Gate Array (FPGA) hardware to advance smart home technology through a self-learning system. The proposed intelligent three-hidden layer system outperformed prior systems with 99.21% accuracy using real-world data from the MavPad dataset. The research shows that FPGA solutions can do difficult computations in seconds. The study also examines the difficulties of maximizing performance with limited resources when incorporating deep learning technologies into FPGAs. Despite these challenges, FPGA-based solutions can improve smart home technology. It promotes the integration of sophisticated algorithms into ordinary electronics to boost their intelligence.
Language: English
Structural-Parametric Synthesis of the Geometric Computer Interface
Published: Jan. 1, 2023
The article is devoted to the consideration of a number of possible structural and parametric compositions that together can form an automated geometric design tool designed to solve geometric, engineering and pedagogical problems. The conceptual apparatus of the work is based on the works of the St. Petersburg geometric school. Modeling the operation process of a geometric machine in the form of a constructive diagram allows you to visualize the stages of its work, starting with obtaining information from the object and ending with the construction of a geometric model. An analysis of existing works has revealed three main areas in the field of processing geometric data: using FPGA, GPU or microcontrollers. The implementation of the shown structures in the form of an analytical model in the high-level programming language Python made it possible to choose the most suitable of them for the first iteration of the geometric computer and to plan further steps of its modernization.
Language: English