Empirical Software Engineering, Journal Year: 2022, Volume and Issue: 27(7)
Published: Oct. 1, 2022
Abstract
Background
Developers spend more time fixing bugs and refactoring the code to increase maintainability than developing new features. Researchers investigated the code quality impact on fault-proneness, focusing on code smells and code metrics.
Objective
We aim at advancing fault-inducing commit prediction using different variables, such as SonarQube rules, product, and process metrics, and adopting different techniques.
Method
We designed and conducted an empirical study among 29 Java projects analyzed with SonarQube and the SZZ algorithm to identify fault-inducing and fault-fixing commits, computing different product and process metrics. Moreover, we investigated fault-proneness using different Machine and Deep Learning models.
Results
We analyzed 58,125 commits containing 33,865 faults and infected by 174 SonarQube rules violated 1.8M times, on which 48 software product and process metrics were calculated. Results clearly identified a set of features that provided a highly accurate fault prediction (more than 95% AUC). Regarding the performance of the classifiers, Deep Learning provided a higher accuracy compared with Machine Learning models.
Conclusion
Future works might investigate whether other static analysis tools, such as FindBugs or Checkstyle, can provide similar results. Moreover, researchers might consider the adoption of time series analysis and anomaly detection techniques.
Almost every Mining Software Repositories (MSR) study requires, as a first step, the selection of the subject software repositories. These repositories are usually collected from hosting services like GitHub using specific selection criteria dictated by the study goal. For example, a study related to licensing might be interested in selecting projects explicitly declaring a license. Once the selection criteria have been defined, utilities such as the GitHub APIs can be used to "query" the hosting service. However, researchers have to deal with usage limitations imposed by these APIs and a lack of required information. For example, the GitHub search APIs allow 30 requests per minute and, when searching repositories, only provide limited information (e.g., the number of commits in a repository is not included). To support researchers in sampling projects from GitHub, we present GHS (GitHub Search), a dataset containing 25 characteristics (e.g., number of commits, license, etc.) of 735,669 repositories written in 10 programming languages. The set of characteristics has been derived by looking for frequently used project selection criteria in MSR studies, and the dataset is continuously updated to (i) always provide fresh data about the existing projects, and (ii) increase the number of indexed projects. The GHS dataset can be queried through a web application we built that allows setting many combinations of selection criteria needed for a study and downloading the information of matching repositories: https://seart-ghs.si.usi.ch.
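The 30-requests-per-minute limit mentioned in the abstract above can be respected on the client side with a sliding-window throttle. The following is only a minimal sketch and is not taken from the GHS tooling. The class name `SlidingWindowLimiter` and its parameters are hypothetical, and the clock and sleep functions are injectable purely so the behaviour can be checked without waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side throttle: at most `max_calls` per `period` seconds."""

    def __init__(self, max_calls=30, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # returns the current time in seconds
        self._sleep = sleep      # blocks for the given number of seconds
        self._stamps = deque()   # times of the calls still inside the window

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have already left the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Window is full: wait until the oldest call expires, then retry.
            self._sleep(self.period - (now - self._stamps[0]))
```

A caller would invoke `acquire()` immediately before each search request. A server may still reject requests for other reasons, so this sketch complements rather than replaces handling of error responses.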
IEEE Transactions on Software Engineering, Journal Year: 2022, Volume and Issue: 49(1), P. 44 - 63
Published: Jan. 6, 2022
Software vulnerabilities are weaknesses in source code that can be potentially exploited to cause loss or harm. While researchers have been devising a number of methods to deal with vulnerabilities, there is still a noticeable lack of knowledge on their software engineering life cycle, for example how vulnerabilities are introduced and removed by developers. This information can be exploited to design more effective methods for vulnerability prevention and detection, as well as to understand the granularity at which these methods should aim. To investigate the life cycle of known software vulnerabilities, we focus on how, when, and under which circumstances the contributions to the introduction of vulnerabilities in software projects are made, as well as how long, and how, they are removed. We consider 3,663 vulnerabilities with public patches from the National Vulnerability Database (pertaining to 1,096 open-source software projects on GitHub) and define an eight-step process involving both automated parts (e.g., using a procedure based on the SZZ algorithm to find the vulnerability-contributing commits) and manual analyses (e.g., how vulnerabilities were fixed). The investigated vulnerabilities can be classified in 144 categories, take on average at least 4 contributing commits before being introduced, and half of them remain unfixed for more than one year. Most of the contributions are done by developers with high workload, often when doing maintenance activities, and removed mostly with the addition of new source code aiming at implementing further checks on inputs. We conclude by distilling practical implications on how vulnerability detectors should work to assist developers in timely identifying these issues.
ACM Computing Surveys, Journal Year: 2023, Volume and Issue: 55(13s), P. 1 - 48
Published: May 13, 2023
The accuracy reported for code smell-detecting tools varies depending on the dataset used to evaluate the tools. Our survey of 45 existing datasets reveals that the adequacy of a dataset for detecting smells highly depends on relevant properties such as the size, severity level, project types, number of each type of smell, number of smells, and the ratio of smelly to non-smelly samples in the dataset. Most existing datasets support God Class, Long Method, and Feature Envy, while six smells in Fowler and Beck's catalog are not supported by any datasets. We conclude that existing datasets suffer from imbalanced samples, lack of supporting severity level, and restriction to the Java language.
IEEE Access, Journal Year: 2021, Volume and Issue: 9, P. 8695 - 8707
Published: Jan. 1, 2021
Purpose: Code smells are residuals of technical debt induced by the developers. They hinder evolution, adaptability and maintenance of the software. Meanwhile, they are very beneficial in indicating loopholes of problems and bugs in the software. Machine learning has been extensively used to predict code smells in research. The current study aims to optimise code smell prediction using Ensemble Learning and Feature Selection techniques on three open-source Java data sets.
Design and Results: The work compares four varied approaches to detect code smells using the performance measures Accuracy (P1), G-mean1 (P2), G-mean2 (P3), and F-measure (P4). The study found that the values of the performance measures did not degrade; instead, they either remained the same or increased with Feature Selection and Ensemble Learning. Random Forest turns out to be the best classifier, while Correlation-based Feature Selection (BFS) is the best amongst the Feature Selection techniques. The aggregators, i.e. ET5C2 (BFS intersection Relief with Random Forest), ET6C2 (BFS union Relief with Random Forest), ET5C1 (BFS intersection Relief with Bagging), and Majority Voting give the best results from all the aggregation combinations studied.
Conclusion: Though the results are good, the study needs a lot of validation for a variety of data sets before it can be standardised. The data sets also pose a challenge concerning diversity and reliability, which calls for exhaustive studies.
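The abstract above names four performance measures without defining them. As a hedged illustration, the sketch below computes them from a binary confusion matrix under commonly used definitions, which are an assumption here and not confirmed by the abstract: G-mean1 as the geometric mean of precision and recall, G-mean2 as the geometric mean of recall and specificity, and F-measure as the harmonic mean of precision and recall. The function name `performance_measures` is hypothetical.

```python
import math


def performance_measures(tp, fp, tn, fn):
    """Compute P1..P4 from confusion-matrix counts (assumed definitions)."""

    def ratio(num, den):
        # Treat an undefined ratio (empty denominator) as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)        # also called sensitivity
    specificity = ratio(tn, tn + fp)

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),          # P1
        "g_mean1": math.sqrt(precision * recall),               # P2 (assumed)
        "g_mean2": math.sqrt(recall * specificity),             # P3 (assumed)
        "f_measure": ratio(2 * precision * recall,
                           precision + recall),                 # P4
    }
```

Measures such as G-mean2 are often preferred over plain accuracy on imbalanced data because they fall to zero when one class is never predicted correctly.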
Applied Sciences, Journal Year: 2025, Volume and Issue: 15(2), P. 633 - 633
Published: Jan. 10, 2025
The rapid expansion of software applications has led to an increase in the frequency of bugs, which are typically reported through user-submitted bug reports. Developers prioritize these reports based on severity and project schedules. However, the manual process of assigning bug priorities is time-consuming and prone to inconsistencies. To address these limitations, this study presents a Priority-Sensitive LSTM–Attention mechanism for automating bug priority prediction. The proposed approach extracts features such as product and component details from bug repositories and preprocesses the data to ensure consistency. Priority-based feature selection is applied to align the input data with the task of bug prioritization. These features are processed through a Long Short-Term Memory (LSTM) network to capture sequential dependencies, and the outputs are further refined using an Attention mechanism to focus on the most relevant information for prediction. The effectiveness of the proposed model was evaluated using datasets from the Eclipse and Mozilla open-source projects. Compared to baseline models such as Naïve Bayes, Random Forest, Decision Tree, SVM, CNN, LSTM, and CNN-LSTM, the proposed model achieved superior performance. It recorded an accuracy of 93.00% for Eclipse and 84.11% for Mozilla, representing improvements of 31.11% and 40.39%, respectively, over the baseline models. Statistical verification confirmed that these performance gains were significant. This study distinguishes itself by integrating priority-based feature selection with a hybrid LSTM–Attention architecture, which enhances prediction accuracy and robustness compared to existing methods. The results demonstrate its potential to streamline bug prioritization, improve project management efficiency, and assist developers in resolving high-priority issues.
Software Practice and Experience, Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 13, 2025
ABSTRACT
Context
Code smells are indicators of poor design and implementation choices that negatively affect software quality and maintainability. Moreover, it is difficult and time‐consuming to work with a long list of the code smells, as not all of those have equal impact on the software system. So, understanding the impact of individual code smells is significant while performing refactorings on a priority basis.
Objective
Despite the research efforts aimed at detecting and refactoring these code smells, their impact on software metrics such as size, complexity, coupling, etc. remains still unclear.
Methodology
To mitigate this gap, we present an empirical investigation and analysis of the impact of code smells based on 25 metrics such as size, cyclomatic complexity, coupling, etc. To the best of our knowledge, it is the largest study about the impact of code smells with respect to the number of metrics. Particularly for this study, we identify 13 code smells in 35 open‐source systems, and analyze (1) the relationship between code smells and metrics, (2) which code smells are highly impactful, and (3) which code smells occur frequently in the systems.
Results
The results show varying degrees of correlation‐based impact of specific code smells on metrics, with some showing strong correlations with multiple metrics. Three impact categories have been identified, namely High, Moderate and Low, where Long Method, Anti Singleton, Complex Class, Large Class and Long Parameter List have high impact, but frequencies except Anti Singleton; Refused Parent Bequest, Spaghetti Code and Blob have moderate impact; and the rest have low impact. We also observe that perceptions vary from developer to developer and that in most cases they refactor based on intuition.
Conclusion
Our findings will help them be more objective‐based instead of intuition‐based and improve software quality. For example, having the coupling between objects metric can be an objective. Furthermore, the findings will not only assist developers in prioritizing refactoring activities but also provide researchers with valuable insights to innovate tools that prioritize code smells. These tools can target the highly impactful code smells and thus enhance the overall software maintainability.
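Code review plays an important role in software quality control. A typical review process would involve a careful check of a piece of code in an attempt to find defects and other quality issues/violations. One type of issue that may impact the quality of the software is code smells, i.e., bad programming practices that may lead to defects or maintenance issues. Yet, little is known about the extent to which code smells are identified during code reviews. To investigate the concept behind code smells identified in code reviews and what actions reviewers suggest and developers take in response to the identified smells, we conducted an empirical study of code smells in code reviews using the two most active OpenStack projects (Nova and Neutron). We manually checked 19,146 review comments obtained by keywords search and random selection, and got 1,190 smell-related reviews to study the causes of code smells and the actions taken against the identified smells. Our analysis found that 1) code smells were not commonly identified in code reviews, 2) smells were usually caused by violation of coding conventions, 3) reviewers usually provided constructive feedback, including fixing (refactoring) recommendations to help developers remove smells, and 4) developers generally followed those recommendations and actioned the changes. Our results suggest that 1) developers should closely follow coding conventions in their projects to avoid introducing code smells, and 2) review-based detection of code smells is perceived to be a trustworthy approach by developers, mainly because reviews are context-sensitive (as reviewers are more aware of the context of the code given that they are part of the project's development team).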