Healthcare Technology Letters,
Journal year: 2024
Issue: 11(6), pp. 485-495
Published: Nov. 22, 2024
Abstract
This paper develops a method for cancer classification from microRNA data using a convolutional neural network (CNN)-based model optimized by a genetic algorithm. The CNN has performed well in various recognition and perception tasks. The method's main contribution lies in the union of two CNNs: its performance is boosted by the relationship between the two CNNs, which exchange knowledge between them. Besides, communication with small sizes reduces the need for a large size and consequently lowers computational time and memory usage while preserving high accuracy.
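The abstract does not describe the implementation; as a generic, hypothetical sketch of the overall idea of a genetic algorithm tuning a small 1-D CNN over gene-expression vectors (the dataset sizes below are taken from the abstract; everything else is an assumption, including the placeholder fitness function), one might write:

```python
# Illustrative sketch only: a genetic algorithm searching CNN hyperparameters
# for a 1-D gene-expression classifier. Data, search space and fitness are
# placeholders, not the authors' setup.
import random
import torch
import torch.nn as nn

N_GENES, N_CLASSES = 1046, 29          # sizes quoted in the abstract

def build_cnn(n_filters, kernel_size, hidden):
    return nn.Sequential(
        nn.Conv1d(1, n_filters, kernel_size, padding=kernel_size // 2),
        nn.ReLU(),
        nn.AdaptiveAvgPool1d(1),
        nn.Flatten(),
        nn.Linear(n_filters, hidden),
        nn.ReLU(),
        nn.Linear(hidden, N_CLASSES),
    )

def fitness(genome, x_val, y_val):
    """Placeholder fitness: validation accuracy of an (untrained) candidate."""
    model = build_cnn(*genome).eval()
    with torch.no_grad():
        pred = model(x_val).argmax(dim=1)
    return (pred == y_val).float().mean().item()

def evolve(x_val, y_val, pop_size=8, generations=5):
    space = {"n_filters": [8, 16, 32], "kernel_size": [3, 5, 7], "hidden": [32, 64]}
    pop = [tuple(random.choice(v) for v in space.values()) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda g: fitness(g, x_val, y_val), reverse=True)
        parents = scored[: pop_size // 2]                        # selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = tuple(random.choice(p) for p in zip(a, b))   # crossover
            if random.random() < 0.2:                            # mutation
                i = random.randrange(len(child))
                child = child[:i] + (random.choice(list(space.values())[i]),) + child[i + 1:]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda g: fitness(g, x_val, y_val))

# Toy usage with synthetic data standing in for the microRNA profiles.
x = torch.randn(64, 1, N_GENES)
y = torch.randint(0, N_CLASSES, (64,))
print(evolve(x, y))
```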
The proposed method is tested on a dataset containing genomic information of 8129 patients across 29 different cancer types with 1046 gene expressions. The accuracy and the selected genes obtained by the approach are compared with those of 22 well-known classifiers on the real-world dataset. For each cancer type, the results are also ranked against 77 results reported in previous works. The method shows 100% accuracy in 24 out of the 29 classes and, in seven of the 29 cases, achieves accuracy that no classifier in other studies has reached. Performance is analysed with respect to several metrics.
International Journal of Intelligent Computing and Cybernetics,
Journal year: 2024
Issue: 18(1), pp. 133-152
Published: Nov. 13, 2024
Purpose
Vision transformer (ViT) detectors excel in processing natural images. However, when applied to remote sensing images (RSIs), ViT methods generally exhibit inferior accuracy compared to approaches based on convolutional neural networks (CNNs). Recently, researchers have proposed various structural optimization strategies to enhance the performance of ViT detectors, but progress has been insignificant. We contend that the frequent scarcity of RSI samples is the primary cause of this problem, and model modifications alone cannot solve it.
Design/methodology/approach
To address this, we introduce a Faster R-CNN-based approach, termed QAGA-Net, which significantly enhances recognition performance. Initially, we propose a novel quantitative augmentation learning (QAL) strategy for the sparse data distribution of RSIs. This is integrated as the QAL module, a plug-and-play component active exclusively during the model's training phase.
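The QAL strategy itself is not specified in this abstract; the following is only a minimal PyTorch sketch of the described "plug-and-play, training-only" pattern, with a placeholder noise-based augmentation standing in for whatever QAL actually computes:

```python
# Minimal sketch of a plug-and-play module that is active only during training
# and becomes an identity mapping at inference time. The augmentation here is
# a placeholder, not the paper's QAL computation.
import torch
import torch.nn as nn

class TrainingOnlyAugment(nn.Module):
    def __init__(self, noise_std: float = 0.1):
        super().__init__()
        self.noise_std = noise_std

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        if self.training:                       # active only in the training phase
            return feats + self.noise_std * torch.randn_like(feats)
        return feats                            # identity at inference

# Usage: wrap an existing backbone without touching its weights or structure.
backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), TrainingOnlyAugment(), nn.ReLU())
x = torch.randn(2, 3, 64, 64)
backbone.train();  _ = backbone(x)              # augmentation applied
backbone.eval();   _ = backbone(x)              # pass-through
```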
Subsequently, we enhanced the feature pyramid network (FPN) by introducing two efficient modules: a global attention (GA) module for long-range dependencies and multi-scale information fusion, and an efficient pooling (EP) module to optimize the capability to understand both high- and low-frequency information. Importantly, QAGA-Net has a compact size and achieves a balance between computational efficiency and accuracy.
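As a rough illustration of what a global attention block over one FPN level could look like (a generic single-head self-attention layer; this is an assumption, not the paper's GA or EP module):

```python
# Rough sketch: single-head self-attention applied to a flattened FPN level,
# giving each location a global receptive field.
import torch
import torch.nn as nn

class GlobalAttention2d(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads=1, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)          # (B, H*W, C)
        out, _ = self.attn(tokens, tokens, tokens)     # long-range dependencies
        tokens = self.norm(tokens + out)               # residual + norm
        return tokens.transpose(1, 2).reshape(b, c, h, w)

p3 = torch.randn(1, 256, 32, 32)                       # one FPN level
print(GlobalAttention2d(256)(p3).shape)                # torch.Size([1, 256, 32, 32])
```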
Findings
The approach was verified using different models as the detector's backbone. Extensive experiments on the NWPU-10 and DIOR20 datasets demonstrate superior performance over 23 other ViT-based or CNN-based methods in the literature. Specifically, QAGA-Net shows an increase in mAP of 2.1% and 2.6% on the challenging dataset over the top-ranked methods, respectively.
Originality/value
This paper highlights the impact of data scarcity on detection performance. It adopts a fundamentally data-driven approach: the QAL module. Additionally, two efficient modules are introduced into the FPN. More importantly, our approach has the potential to collaborate with other methods and does not require any modification.
The Photogrammetric Record,
Journal year: 2025
Issue: 40(189)
Published: Jan. 1, 2025
ABSTRACT
Existing Vision Transformer (ViT)-based object detection methods for remote sensing images (RSIs) face significant challenges due to the scarcity of RSI samples and over-reliance on enhancement strategies originally developed for natural images. This often leads to inconsistent data distributions between training and testing subsets, resulting in degraded model performance. In this study, we introduce an optimized data distribution learning (ODDL) strategy and develop a framework based on the Faster R-CNN architecture, named ODDL-Net. The ODDL strategy begins with an optimized augmentation (OA) technique, overcoming the limitations of conventional augmentation methods. Next, we propose an optimized mosaic algorithm (OMA), improving upon the shortcomings of traditional Mosaic techniques.
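For context, a bare-bones version of the classical Mosaic augmentation that OMA reportedly improves on might look like the sketch below (bounding-box handling and resizing omitted; this is not the paper's optimized variant):

```python
# Sketch of plain Mosaic augmentation (the classical baseline, not OMA).
import random
import numpy as np

def mosaic(images, out_size=640):
    """Stitch four images into one canvas around a random centre point."""
    canvas = np.zeros((out_size, out_size, 3), dtype=images[0].dtype)
    cx = random.randint(out_size // 4, 3 * out_size // 4)
    cy = random.randint(out_size // 4, 3 * out_size // 4)
    regions = [(0, 0, cx, cy), (cx, 0, out_size, cy),
               (0, cy, cx, out_size), (cx, cy, out_size, out_size)]
    for img, (x1, y1, x2, y2) in zip(images, regions):
        h, w = y2 - y1, x2 - x1
        crop = img[:h, :w]                      # naive crop; real code resizes
        canvas[y1:y1 + crop.shape[0], x1:x1 + crop.shape[1]] = crop
    return canvas

tiles = [np.full((640, 640, 3), v, dtype=np.uint8) for v in (50, 100, 150, 200)]
print(mosaic(tiles).shape)                      # (640, 640, 3)
```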
Additionally, a feature fusion regularization (FFR) method is proposed, addressing the inherent limitations of classic feature pyramid networks. These innovations are integrated into three modular, plug-and-play components, namely the OA, OMA, and FFR modules, ensuring that they can be seamlessly incorporated into existing frameworks without requiring modifications.
To evaluate the effectiveness of the proposed ODDL-Net, two variants with different ViT architectures, the Next ViT (NViT) small model and the Swin Transformer (SwinT) tiny model, are used as backbones. Experimental results on the NWPU10, DIOR20, MAR20, and GLH-Bridge datasets demonstrate that ODDL-Net can achieve impressive accuracy, surpassing 23 state-of-the-art detectors introduced since 2023. Specifically, ODDL-Net-NViT attained accuracies of 78.3% on the challenging DIOR20 dataset and 61.4% on a further benchmark dataset. Notably, this represents a substantial improvement of approximately 23% over the Faster R-CNN-ResNet50 baseline. In conclusion, this study demonstrates that ViTs are well suited for high-accuracy detection in RSIs. Furthermore, it provides a straightforward solution for building ViT-based detectors, offering a practical approach that requires little modification.
Scientific Reports,
Journal year: 2025
Issue: 15(1)
Published: March 18, 2025
Detecting small objects in complex remote sensing environments presents significant challenges, including insufficient extraction of local spatial information, rigid feature fusion, and limited global representation. In addition, improving model performance requires a delicate balance between improving accuracy and managing computational complexity. To address these challenges, we propose the SMA-YOLO algorithm. First, we introduce a Non-Semantic Sparse Attention (NSSA) mechanism in the backbone network, which efficiently extracts non-semantic features related to the task, thus improving the model's sensitivity to small objects. At the neck, we design a Bidirectional Multi-Branch Auxiliary Feature Pyramid Network (BIMA-FPN), which integrates high-level semantic information with low-level details, improving object detection while expanding multi-scale receptive fields.
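A generic PAN-style sketch of bidirectional (top-down then bottom-up) fusion across three pyramid levels, shown only to illustrate the pattern and not the actual BIMA-FPN design:

```python
# Illustrative bidirectional pyramid fusion (generic PAN-style, not BIMA-FPN).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BidirectionalFusion(nn.Module):
    def __init__(self, channels: int = 128):
        super().__init__()
        self.smooth = nn.ModuleList(nn.Conv2d(channels, channels, 3, padding=1) for _ in range(3))

    def forward(self, c3, c4, c5):
        # top-down: inject high-level semantics into finer levels
        p5 = c5
        p4 = c4 + F.interpolate(p5, size=c4.shape[-2:], mode="nearest")
        p3 = c3 + F.interpolate(p4, size=c3.shape[-2:], mode="nearest")
        # bottom-up: push low-level detail back up the pyramid
        n3 = self.smooth[0](p3)
        n4 = self.smooth[1](p4 + F.max_pool2d(n3, 2))
        n5 = self.smooth[2](p5 + F.max_pool2d(n4, 2))
        return n3, n4, n5

feats = [torch.randn(1, 128, s, s) for s in (64, 32, 16)]
print([t.shape[-1] for t in BidirectionalFusion()(*feats)])   # [64, 32, 16]
```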
Finally, we incorporate a Channel-Space Fusion Adaptive Head (CSFA-Head), which adaptively handles consistency problems across different scales, further improving robustness in complex scenarios. Experimental results on the VisDrone2019 dataset show that SMA-YOLO achieves a 13% improvement in mAP compared to the baseline model, demonstrating exceptional adaptability in detection tasks for remote sensing imagery. These results provide valuable insights and new approaches to advance research in this area.
Annals of Emerging Technologies in Computing,
Journal year: 2024
Issue: 8(4), pp. 56-76
Published: Oct. 1, 2024
Vision Transformers (ViTs) have demonstrated exceptional accuracy in classifying remote sensing images (RSIs). However, existing knowledge distillation (KD) methods for transferring representations from a large ViT to a more compact Convolutional Neural Network (CNN) have proven ineffective. This limitation significantly hampers the remarkable generalization capability of ViTs during deployment due to their substantial size. Contrary to common beliefs, we argue that domain discrepancies, along with the inherent nature of RSIs, constrain the effectiveness and efficiency of cross-modal transfer. Consequently, we propose a novel Variance Consistency Learning (VCL) strategy to enhance the KD process, implemented through a plug-and-play module within the ViT-teaching-CNN pipeline.
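The abstract does not give the VCL formulation; one plausible reading, sketched below purely as an assumption, is a standard soft-label distillation loss plus a term matching per-channel feature variances between teacher and student:

```python
# Hedged sketch: distillation loss with an extra "variance consistency" term
# matching per-channel variance statistics of teacher and student features.
# This is an assumed formulation, not the paper's actual VCL loss.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, student_feat, teacher_feat,
            temperature: float = 4.0, alpha: float = 0.5):
    # classic soft-label distillation term
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    # illustrative variance-consistency term over channel statistics
    var_s = student_feat.flatten(2).var(dim=2)     # (B, C)
    var_t = teacher_feat.flatten(2).var(dim=2)
    var_term = F.mse_loss(var_s, var_t)
    return soft + alpha * var_term

s_logits, t_logits = torch.randn(4, 10), torch.randn(4, 10)
s_feat, t_feat = torch.randn(4, 64, 14, 14), torch.randn(4, 64, 14, 14)
print(kd_loss(s_logits, t_logits, s_feat, t_feat))
```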
We evaluated our student model, termed VCL-Net, on three datasets. The results reveal that VCL-Net exhibits superior accuracy and size compared to 33 other state-of-the-art methods published in the past few years. Specifically, it surpasses KD-based methods with a maximum improvement of 22% across the different datasets. Furthermore, visualization analysis of model activations reveals that VCL-Net has learned the long-range dependencies of features from its teacher. Moreover, ablation experiments suggest that our method reduced the time costs of the KD process by at least 75%. Therefore, this study offers an effective and efficient approach for cross-modal knowledge transfer when addressing domain discrepancies.
In the context of intelligent agriculture, tomato cultivation involves complex environments, where leaf occlusion and small disease areas significantly impede the performance of detection models. To address these challenges, this study proposes an efficient Tomato Disease Detection Network (E-TomatoDet), which enhances detection effectiveness by integrating and amplifying global and local feature perception capabilities. First, the CSWinTransformer (CSWinT) is integrated into the backbone network, substantially improving the capacity to capture disease features. Second, a Comprehensive Multi-Kernel Module (CMKM) is designed to effectively incorporate large, medium, and small kernel branches to learn multi-scale features of diseases.
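As a hedged illustration of the multi-kernel idea (parallel small/medium/large kernel branches fused by a 1x1 convolution; the kernel sizes are assumptions and this is not the actual CMKM):

```python
# Illustrative multi-kernel block: parallel branches with different kernel
# sizes, concatenated and fused by a 1x1 convolution.
import torch
import torch.nn as nn

class MultiKernelBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in (3, 7, 11)
        )
        self.fuse = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 32, 56, 56)
print(MultiKernelBlock(32)(x).shape)    # torch.Size([1, 32, 56, 56])
```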
Moreover, a Local Feature Enhance Pyramid (LFEP) neck network is developed based on the CMKM module, which integrates features across different layers to acquire more comprehensive representations of diseases, thereby improving the detection of targets at various scales under complex backgrounds. Finally, the proposed model's performance was validated on two datasets. Notably, on one dataset, E-TomatoDet improved the mean Average Precision (mAP50) by 4.7% compared to the baseline model, reaching 97.2% and surpassing the advanced real-time detector YOLOv10s. This research provides an effective solution for efficiently detecting vegetable pest and disease issues.
Scientific Reports,
Journal year: 2025
Issue: 15(1)
Published: April 2, 2025
As an emerging State Space Model (SSM), the Mamba model draws inspiration from the architecture of Recurrent Neural Networks (RNNs), significantly enhancing the global receptive field and feature extraction capabilities of object detection models. Compared to traditional Convolutional Neural Networks (CNNs) and Transformers, Mamba demonstrates superior performance in handling complex scale variations and multi-view interference, making it particularly suitable for detection tasks in dynamic environments such as fire scenarios. To enhance visual detection technologies and provide a novel approach, this paper proposes an efficient detection algorithm based on YOLOv9 and introduces multiple key techniques to design a high-performance detector leveraging the attention mechanism. First, it presents a novel efficient attention mechanism, the EMA module. Unlike existing self-attention mechanisms, EMA integrates adaptive average pooling with the SSM module, eliminating the need for full-scale association computations across all positions. Instead, it performs dimensionality reduction on the input features through pooling and utilizes the state update mechanism of the SSM module to optimize the feature representation and information flow.
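The following toy module only illustrates the described combination of adaptive average pooling with a recurrent state update; it uses a GRU cell as a stand-in for the SSM scan and is not the paper's EMA module or a real Mamba implementation:

```python
# Toy stand-in: pooling for dimensionality reduction plus a sequential state
# update that produces channel re-weighting, instead of full pairwise attention.
import torch
import torch.nn as nn

class PooledStateAttention(nn.Module):
    def __init__(self, channels: int, pooled: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(pooled)       # dimensionality reduction
        self.cell = nn.GRUCell(channels, channels)     # simple state-update stand-in
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = self.pool(x).flatten(2).transpose(1, 2)     # (B, P*P, C)
        state = tokens.new_zeros(b, c)
        for t in range(tokens.size(1)):                      # sequential state updates
            state = self.cell(tokens[:, t], state)
        gate = torch.sigmoid(state).view(b, c, 1, 1)         # channel re-weighting
        return self.proj(x * gate)

x = torch.randn(2, 64, 40, 40)
print(PooledStateAttention(64)(x).shape)    # torch.Size([2, 64, 40, 40])
```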
Second, to address the limitations of such models in local modeling, this study incorporates ConvNeXtV2 into the backbone network, improving the model's ability to capture fine-grained details and thereby strengthening its overall feature extraction capability. Additionally, a non-monotonic focusing mechanism and a distance penalty strategy are employed to refine the loss function, leading to a substantial improvement in bounding box regression accuracy.
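As a reference point for what a distance penalty on box regression means, here is a standard DIoU-style loss; the paper's exact non-monotonic focusing and penalty terms are not specified in the abstract:

```python
# Hedged sketch of a distance-penalised IoU loss (DIoU-style), shown only to
# illustrate the general idea of a distance penalty on bounding-box regression.
import torch

def diou_loss(pred, target, eps: float = 1e-7):
    """Boxes as (x1, y1, x2, y2) tensors of shape (N, 4)."""
    ix1, iy1 = torch.max(pred[:, 0], target[:, 0]), torch.max(pred[:, 1], target[:, 1])
    ix2, iy2 = torch.min(pred[:, 2], target[:, 2]), torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # distance penalty: squared centre distance over squared enclosing diagonal
    cpx, cpy = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    ctx, cty = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    ex1, ey1 = torch.min(pred[:, 0], target[:, 0]), torch.min(pred[:, 1], target[:, 1])
    ex2, ey2 = torch.max(pred[:, 2], target[:, 2]), torch.max(pred[:, 3], target[:, 3])
    center_dist = (cpx - ctx) ** 2 + (cpy - cty) ** 2
    diag = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + eps
    return (1 - iou + center_dist / diag).mean()

p = torch.tensor([[10., 10., 50., 50.]])
t = torch.tensor([[12., 8., 48., 52.]])
print(diou_loss(p, t))
```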
Experimental results demonstrate the effectiveness of the proposed method in fire detection tasks. The model achieves an FPS of 71, with [Formula: see text] of 91.0% on the large-scale dataset and 87.2% on the small-scale dataset. Compared with existing methods, the approach maintains high accuracy while exhibiting significant computational efficiency advantages.
IET Cyber-Systems and Robotics,
Journal year: 2025
Issue: 7(1)
Published: Jan. 1, 2025
ABSTRACT
Spike transformers cannot be pretrained due to objective factors such as the lack of datasets and memory constraints, which results in a significant performance gap compared to artificial neural networks (ANNs), thereby hindering their practical applicability. To address this issue, we propose a hybrid attention spike transformer that utilises self-attention with compound tokens and channel attention-based token processing to better capture the inductive biases of the data. We also add convolution to the patch splitting and feedforward networks, which not only provides local information but also leverages the translation invariance and locality of convolutions to help the model converge.
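A non-spiking sketch of a convolutional feedforward block, illustrating the general idea of inserting a depthwise convolution into a transformer FFN for locality (the actual spiking implementation and layer choices in the paper would differ):

```python
# Generic convolutional FFN pattern: linear expansion, depthwise 3x3 mixing on
# the reshaped token grid, then linear projection back. Not the paper's design.
import torch
import torch.nn as nn

class ConvFFN(nn.Module):
    def __init__(self, dim: int, hidden: int, hw: int):
        super().__init__()
        self.hw = hw                                   # spatial side of the token grid
        self.fc1 = nn.Linear(dim, hidden)
        self.dwconv = nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden)
        self.fc2 = nn.Linear(hidden, dim)
        self.act = nn.GELU()

    def forward(self, tokens):                         # tokens: (B, N, dim), N = hw*hw
        b, n, _ = tokens.shape
        x = self.act(self.fc1(tokens))
        x = x.transpose(1, 2).reshape(b, -1, self.hw, self.hw)
        x = self.act(self.dwconv(x))                   # local, translation-equivariant mixing
        x = x.flatten(2).transpose(1, 2)
        return self.fc2(x)

tokens = torch.randn(2, 49, 96)                        # 7x7 patch grid, dim 96
print(ConvFFN(96, 192, hw=7)(tokens).shape)            # torch.Size([2, 49, 96])
```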
Experiments on static and neuromorphic datasets demonstrate that our method achieves state-of-the-art performance in the spiking neural network (SNN) field. Notably, we achieve a top-1 accuracy of 80.59% on CIFAR-100 with 4 time steps. As far as we know, this is the first exploration of multiattention fusion, achieving outstanding effectiveness.