Do LLMs consider security? An empirical study on responses to programming questions
Amirali Sajadi, Binh Le, Thu Anh Nguyen et al.
Empirical Software Engineering,
Journal Year:
2025,
Volume and Issue:
30(3)
Published: April 16, 2025
Abstract
The widespread adoption of conversational LLMs for software development has raised new security concerns regarding the safety of LLM-generated content. Our motivational study outlines ChatGPT's potential in volunteering context-specific information to developers, promoting safe coding practices. Motivated by this finding, we conduct a study to evaluate the degree of security awareness exhibited by three prominent LLMs: Claude 3, GPT-4, and Llama 3. We prompt these LLMs with Stack Overflow questions that contain vulnerable code to evaluate whether they merely provide answers or whether they also warn users about the insecure code, thereby demonstrating security awareness. Further, we assess whether LLM responses discuss the causes, exploits, and fixes of the vulnerability to help raise users' awareness. Our findings show that all three models struggle to accurately detect vulnerabilities, achieving a detection rate of only 12.6% to 40% across our datasets. We also observe that the models tend to identify certain types of vulnerabilities, such as those related to sensitive information exposure and improper input neutralization, much more frequently than other types, such as those involving external control of file names or paths. Furthermore, when the models do issue warnings, they often provide more information on the causes, exploits, and fixes of the vulnerability compared with Stack Overflow responses. Finally, we provide an in-depth discussion of the implications of our findings and demonstrate that a CLI-based prompting tool can be used to produce more secure responses.
Language: English
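For illustration only, the sketch below shows the general shape of a prompting wrapper like the one discussed above: a Stack Overflow question is sent to a chat model together with a security-focused system prompt, and the reply is checked for a warning. The model name, prompt wording, and keyword heuristic are assumptions made for this sketch and are not the paper's actual tool or evaluation procedure.

# Minimal sketch (not the paper's tool): wrap a programming question in a
# security-focused system prompt and naively check whether the reply warns
# about insecure code.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

SECURITY_SYSTEM_PROMPT = (
    "You are a coding assistant. Before answering, inspect the user's code "
    "for security vulnerabilities. If you find any, explain the cause, a "
    "possible exploit, and a fix alongside your answer."
)

def ask_with_security_prompt(question: str, model: str = "gpt-4") -> str:
    """Send a Stack Overflow style question with a security-focused system prompt."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SECURITY_SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

def mentions_security_warning(answer: str) -> bool:
    """Crude keyword check standing in for the study's response labeling."""
    keywords = ("vulnerab", "insecure", "injection", "sanitiz", "cwe")
    return any(k in answer.lower() for k in keywords)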
Improving VulRepair’s Perfect Prediction by Leveraging the LION Optimizer
Brian Kishiyama, Young Lee, Jeong Yang et al.
Published: June 12, 2024
In many of the current software applications, numerous vulnerabilities may be present. Attackers attempt to exploit existing vulnerabilities that lead to security breaches, unauthorized entry, data theft, or the incapacitation of a computer system. Rather than addressing hardware at a later stage, it is better to address these issues immediately. DevSecOps, when utilized in application development, tackles these issues at an early stage. The AIBugHunter tool addresses this problem; it was developed by the ASWM research group to predict, classify, and repair vulnerabilities. It integrates LineVul to find vulnerable code lines and returns information to developers about the type of vulnerability and its severity. It also includes a repair tool, VulRepair, which detects and repairs vulnerabilities. VulRepair currently predicts patches for vulnerable functions at a rate of 44%. In order to become truly effective, this number needs to be increased. This study examines whether the 44% Perfect Prediction can be improved. VulRepair is a T5-based model that uses Natural Language and Programming Languages for pre-training, along with Byte Pair Encoding. It outperforms other models, such as VRepair and CodeBERT. However, its hyperparameters were not optimized, due in part to the development of new optimizers. We review a Deep Neural Network (DNN) optimizer released by Google in 2023, called Evolved Sign Momentum (LION) and available in PyTorch. We applied LION and tested its influence on the hyperparameters. After adjusting the hyperparameters, we obtained 56% Perfect Prediction, which exceeds the reported value and means that more attacks can be avoided. As far as we know, our approach of utilizing an alternative to AdamW, the standard optimizer, has not been previously used to enhance similar models.
Language: English
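As a rough illustration of the optimizer swap described above, the sketch below chooses LION instead of AdamW when building the optimizer for a PyTorch model. LION is not part of torch.optim; the sketch assumes the third-party lion-pytorch package, and the learning rate and weight decay values are placeholders rather than the tuned hyperparameters reported in the paper.

# Minimal sketch: choosing LION instead of AdamW for fine-tuning. Assumes the
# third-party `lion-pytorch` package; hyperparameter values are placeholders.
import torch
from lion_pytorch import Lion

def build_optimizer(model: torch.nn.Module, use_lion: bool = True):
    if use_lion:
        # LION is usually run with a smaller learning rate and larger weight
        # decay than AdamW; these specific numbers are illustrative only.
        return Lion(model.parameters(), lr=1e-4, weight_decay=1e-2)
    return torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=1e-4)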
Requirements Engineering for Trustworthy Human-AI Synergy in Software Engineering 2.0
David Lo
Published: June 24, 2024
Language: English
Improving VulRepair’s Perfect Prediction by Leveraging the LION Optimizer
Applied Sciences,
Journal Year:
2024,
Volume and Issue:
14(13), P. 5750 - 5750
Published: July 1, 2024
In current software applications, numerous vulnerabilities may be present. Attackers attempt to exploit these vulnerabilities, leading to security breaches, unauthorized entry, data theft, or the incapacitation of computer systems. Instead of addressing hardware at a later stage, it is better to address these issues immediately during the development phase. Tools such as AIBugHunter provide solutions designed to tackle these issues by predicting, categorizing, and fixing coding vulnerabilities. Essentially, developers can see where their code is susceptible to attacks and obtain details about the nature and severity of the vulnerabilities. AIBugHunter incorporates VulRepair to detect and repair vulnerabilities. VulRepair currently predicts patches for vulnerable functions at a rate of 44%. To become truly effective, this number needs to be increased. This study examines whether the 44% perfect prediction can be improved. VulRepair is based on T5 and uses both natural language and programming languages in its pretraining phase, along with byte pair encoding. The text-to-text transfer transformer model has an encoder and a decoder as part of its neural network. It outperforms other models such as VRepair and CodeBERT. However, its hyperparameters were not optimized, due in part to the development of new optimizers. We reviewed a deep neural network (DNN) optimizer developed by Google in 2023. This optimizer, Evolved Sign Momentum (LION), is available in PyTorch. We applied LION and tested its influence on the hyperparameters. After adjusting the hyperparameters, we obtained 56% perfect prediction, which exceeds the reported value and means that more attacks can be avoided. As far as we know, our approach of utilizing an alternative to AdamW, the standard optimizer, has not been previously used to enhance similar models.
Language: English
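To make the encoder-decoder (text-to-text) setup concrete, the sketch below feeds a vulnerable function to a T5-family model and decodes a candidate patch. The checkpoint name (a CodeT5-style model) and the input framing are assumptions for illustration; a model fine-tuned on vulnerability-fix pairs, as in VulRepair, would be needed to produce meaningful repairs.

# Minimal sketch of text-to-text patch generation with an encoder-decoder model.
# The checkpoint and input format are illustrative assumptions, not VulRepair's
# exact pipeline; the base model here is not fine-tuned for repair.
from transformers import AutoTokenizer, T5ForConditionalGeneration

checkpoint = "Salesforce/codet5-base"  # a CodeT5-style encoder-decoder model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

vulnerable_function = "char buf[8]; strcpy(buf, user_input);"
inputs = tokenizer(vulnerable_function, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
candidate_patch = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(candidate_patch)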
VulAdvisor: Natural Language Suggestion Generation for Software Vulnerability Repair
Published: Oct. 18, 2024
Language: English
EffFix: Efficient and Effective Repair of Pointer Manipulating Programs
ACM Transactions on Software Engineering and Methodology,
Journal Year:
2024,
Volume and Issue:
unknown
Published: Nov. 21, 2024
This work introduces EffFix, a tool that applies a novel static analysis-driven Automated Program Repair (APR) technique for fixing memory errors. APR tools typically rely on a given test-suite to guide the repair process. Apart from the need to provide test oracles, this reliance is also one of the main contributors to the over-fitting problem. Static analysis based techniques bypass these issues, only to introduce new ones, such as soundness, scalability, and generalizability. This work demonstrates how we can overcome these challenges and achieve sound bug repair at scale by leveraging static analysis (specifically Incorrectness Separation Logic, ISL) for repair. This is the first approach to use ISL for repair. Our key insight is that the abstract domain used to detect bugs contains information to derive correct patches. The proposed approach learns what a desirable patch is by closely inspecting the feedback from the ISL analysis (the Pulse analyzer), turning it into a distribution of probabilities over context-free grammars. The approach is generic in its learning strategy, which allows finding patches without relying on commonly used templates. Furthermore, to make program repair efficient, instead of focusing on heuristics for reducing the search space of patches, we make validation scalable by creating classes of patches that are equivalent according to the effect they have on the symbolic heap. We then conduct candidate validation only once per equivalence class. This allows EffFix to efficiently discover high-quality repairs even in the presence of a large pool of candidates. Experimental evaluation on real world errors in medium to large subjects like OpenSSL, the Linux Kernel, and swoole shows the efficiency and effectiveness of EffFix in terms of automatically producing patches over large search spaces. In particular, EffFix has a fix ratio of 66% for memory leaks and 83% for Null Pointer Dereferences in the considered dataset.
Language: English
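The grouping-and-validation idea above can be pictured with the small sketch below: candidate patches are bucketed by their abstract effect on the symbolic heap and only one representative per bucket is validated. The functions heap_effect and validate are hypothetical stand-ins for EffFix's ISL/Pulse-based machinery, not its actual interfaces.

# Minimal sketch: validate one candidate per equivalence class. `heap_effect`
# and `validate` are hypothetical stand-ins for EffFix's ISL-based analysis.
from collections import defaultdict
from typing import Callable, Hashable, Iterable, List

def validate_by_equivalence_class(
    candidates: Iterable[str],
    heap_effect: Callable[[str], Hashable],  # abstract effect on the symbolic heap
    validate: Callable[[str], bool],         # expensive per-candidate validation
) -> List[str]:
    classes = defaultdict(list)
    for patch in candidates:
        classes[heap_effect(patch)].append(patch)

    accepted: List[str] = []
    for members in classes.values():
        if validate(members[0]):      # validate a single representative
            accepted.extend(members)  # all members share the same heap effect
    return accepted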
Fuzz to the Future: Uncovering Occluded Future Vulnerabilities via Robust Fuzzing
Arvind S. Raj, W. R. Gibbs, Fangzhou Dong et al.
Published: Dec. 2, 2024
The security landscape of software systems has witnessed considerable advancements through dynamic testing methodologies, especially fuzzing. Traditionally, fuzzing involves a sequential, cyclic process where software is tested to identify crashes. These crashes are then triaged and patched, leading to subsequent cycles that uncover further vulnerabilities. While effective, this method is not efficient, as each cycle potentially reveals new issues previously obscured by earlier crashes, thus resulting in vulnerabilities being discovered sequentially.
Language: English
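For illustration, the sequential cycle described above can be sketched as a simple loop: fuzz until a crash, triage and patch it, then fuzz again, since each patch may expose vulnerabilities the earlier crash was masking. The functions fuzz, triage, and apply_patch are hypothetical placeholders, not an actual fuzzing API.

# Minimal sketch of the traditional sequential fuzz-triage-patch cycle.
# `fuzz`, `triage`, and `apply_patch` are hypothetical placeholders.
def sequential_fuzz_cycle(target, fuzz, triage, apply_patch, max_cycles=10):
    found = []
    for _ in range(max_cycles):
        crash = fuzz(target)               # run the fuzzer until a crash or timeout
        if crash is None:
            break                          # no further crashes uncovered this cycle
        bug = triage(crash)                # root-cause the crash
        found.append(bug)
        target = apply_patch(target, bug)  # patching may reveal occluded bugs next cycle
    return found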