The
recognition
of
Application
Programming
Interface
(API)
mentions
in software-related texts
is
a
prerequisite
task
for
extracting
API-related
knowledge.
Previous
studies
have
demonstrated
the superiority of deep learning-based methods in accomplishing this task.
However,
such
techniques
still
hit bottlenecks due to their inability to effectively handle the following three challenges: (1) differentiating APIs from common words; (2) identifying morphological variants of standard APIs; and (3) the lack of high-quality labeled data for training.
To
overcome
these
challenges,
this paper proposes a context-aware API recognition method named CAREER.
This
approach
utilizes
two
key
components,
namely
Bidirectional Encoder Representations from Transformers (BERT) and Bi-directional Long Short-Term Memory (BiLSTM), to extract context information at both the word level and the sequence level. This strategic combination empowers the model to dynamically capture syntactic and semantic information, addressing the first challenge.
To tackle the second challenge, CAREER introduces a character-level BiLSTM component, enriched with an attention mechanism. This enables the model to grasp the global context, thereby enhancing the recognition of morphological attributes within API mentions.
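As a rough illustration of this architecture (a sketch with assumed layer sizes, tag set, and wiring, not the authors' code), the word-level BERT features, the character-level BiLSTM with attention, and the sequence-level BiLSTM could be combined as follows:

# Sketch of a CAREER-style tagger; sizes, tag set, and wiring are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel

class CareerStyleTagger(nn.Module):
    def __init__(self, char_vocab=128, char_dim=32, hidden=256, n_tags=3):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")  # word-level context
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_dim, bidirectional=True, batch_first=True)
        self.char_attn = nn.Linear(2 * char_dim, 1)      # attention over characters
        self.seq_lstm = nn.LSTM(768 + 2 * char_dim, hidden, bidirectional=True,
                                batch_first=True)        # sequence-level context
        self.classifier = nn.Linear(2 * hidden, n_tags)  # e.g., BIO tags for API mentions

    def forward(self, input_ids, attention_mask, char_ids):
        word = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        b, s, c = char_ids.shape
        ch, _ = self.char_lstm(self.char_emb(char_ids.view(b * s, c)))
        weights = torch.softmax(self.char_attn(ch), dim=1)  # weigh each character
        chars = (weights * ch).sum(dim=1).view(b, s, -1)    # per-word character feature
        out, _ = self.seq_lstm(torch.cat([word, chars], dim=-1))
        return self.classifier(out)                         # per-token tag logits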
Furthermore, to address the third challenge, a data augmentation technique aimed at generating new samples is introduced. Accompanying it, a novel sample selection algorithm is designed to screen out low-quality instances. This dual-pronged strategy mitigates the requirement for manual labeling.
Experiments demonstrate that CAREER significantly improves the F1-score by 11.0% compared with state-of-the-art methods. We also construct specific datasets to assess CAREER's capacity to handle the aforementioned challenges. Results confirm that CAREER outperforms the baselines with the aid of the proposed algorithms, that new samples can be generated to improve performance, and that the labeling burden can be alleviated.
ACM Transactions on Software Engineering and Methodology, Journal Year: 2023, Volume and Issue: 33(2), P. 1 - 69. Published: Nov. 6, 2023
Automated
program
repair
(APR)
aims
to
fix
software
bugs
automatically
and
plays
a
crucial
role
in
software development and maintenance.
With the recent advances in deep learning (DL), an increasing number of APR techniques have been proposed to leverage neural networks to learn bug-fixing patterns from massive open-source code repositories.
Such learning-based techniques usually treat APR as a neural machine translation (NMT) task, where buggy code snippets (i.e., the source language) are translated into fixed code snippets (i.e., the target language) automatically.
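As a minimal sketch of this NMT formulation (the CodeT5 checkpoint, snippet, and decoding settings are illustrative assumptions, not any single surveyed technique), a sequence-to-sequence model maps a buggy snippet to ranked candidate patches:

# Minimal sketch of APR as sequence-to-sequence translation (illustrative checkpoint).
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

buggy = "if (a = b) { return true; }"   # buggy snippet: the source "language"
inputs = tokenizer(buggy, return_tensors="pt")
# Beam search yields several ranked candidate patches (the target "language").
outputs = model.generate(**inputs, max_length=64, num_beams=5, num_return_sequences=5)
for patch in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(patch)  # each candidate is later validated, e.g., against the test suite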
Benefiting from the powerful capability of DL to learn hidden relationships from previous bug-fixing datasets, learning-based APR techniques have achieved remarkable performance.
In
this
article,
we
provide
a systematic survey to summarize the current state-of-the-art research in the learning-based APR community.
We illustrate the general workflow of learning-based APR techniques and detail the crucial components, including the fault localization, patch generation, patch ranking, patch validation, and patch correctness phases.
We then discuss the widely adopted datasets and evaluation metrics, and outline existing empirical studies.
We also discuss several critical aspects of learning-based APR techniques, such as repair domains, industrial deployment, and the open science issue.
We highlight practical guidelines on applying learning-based APR for future studies, such as exploring explainable patch generation and utilizing code features.
Overall, our article can help researchers gain a comprehensive understanding of existing achievements and promote the practical application of these techniques.
Our artifacts are publicly available at the repository: https://github.com/iSEngLab/AwesomeLearningAPR.
Software
is
constantly
changing,
requiring
developers
to
perform
several
derived
tasks
in
a
timely
manner,
such
as
writing a description for the intention of a code change, or identifying defect-prone changes.
Considering that the cost of dealing with these tasks can account for a large proportion (typically around 70 percent) of the total development expenditure, automating such processes will significantly lighten the burdens of developers.
To achieve this target, existing approaches mainly rely on training deep learning models from scratch or fine-tuning pre-trained models on these tasks, both of which have weaknesses.
Specifically, the former uses comparatively small-scale labelled data for training, making it difficult to learn and exploit the domain knowledge of programming languages hidden in the large amount of unlabelled code in the wild; the latter is hard-pressed to fully leverage the knowledge learned by the pre-trained model, since such models are designed to encode a single code snippet rather than a code change (the difference between two snippets).
We propose to pre-train a model specially designed for code changes to better support software maintenance.
To this end, we first collect a large-scale dataset containing 1.5M+ pairwise data of code changes and commit messages.
Based on these data, we curate five different tasks for pre-training, which equip the model with diverse domain knowledge about code changes. We then fine-tune the pre-trained model, CCT5, on three widely-studied tasks incurred by code changes and two tasks specific to the code review process.
Results show that CCT5 outperforms both conventional deep learning approaches and existing pre-trained models on these tasks.
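To make the pre-training setup concrete, here is a hedged sketch of how one (code change, commit message) pair might be linearized for a seq2seq model; the diff tags and task format are assumptions for illustration, not CCT5's exact scheme:

# Illustrative construction of one (code change, commit message) pre-training pair.
import difflib

before = ["def area(r):", "    return 3.14 * r * r"]
after = ["def area(r):", "    import math", "    return math.pi * r * r"]
message = "Use math.pi instead of a hard-coded constant"

# Linearize the change as a tagged diff the encoder can consume.
diff_tokens = []
for line in difflib.unified_diff(before, after, lineterm=""):
    if line.startswith("+") and not line.startswith("+++"):
        diff_tokens.append("<add> " + line[1:])
    elif line.startswith("-") and not line.startswith("---"):
        diff_tokens.append("<del> " + line[1:])

source = " ".join(diff_tokens)
# One possible pre-training task: generate the commit message from the change.
example = {"input": source, "target": message}
print(example)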
Representing
code
changes
as
numeric
feature
vectors,
i.e.,
code change representations, is usually an essential step to automate many software engineering tasks related to code changes,
e.g.,
commit
message
generation
and
just-in-time
defect
prediction.
Intuitively, the quality of code change representations is crucial for the effectiveness of automated approaches.
Prior work usually designs and evaluates code change representation approaches for a specific task, and little work has investigated code change encoders that can be used for and jointly trained on various tasks.
To fill this gap, this work proposes a novel Code Change Representation learning approach named CCRep, which can learn to encode code changes as feature vectors for diverse downstream tasks.
Specifically, CCRep regards a code change as the combination of its before-change and after-change code, leverages a pre-trained code model to obtain high-quality contextual embeddings, and uses a query-back mechanism to extract the changed code fragments and make them explicitly interact with the whole code change.
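A minimal sketch of this query-back idea (dimensions and wiring are assumptions, not CCRep's exact design): embeddings of the changed fragments attend over the whole change to produce a pooled change vector.

# Sketch of a query-back-style interaction (assumed dimensions and pooling).
import torch
import torch.nn as nn

d_model = 768
attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=8, batch_first=True)

whole_change = torch.randn(1, 120, d_model)  # contextual embeddings of before+after code
changed_frag = torch.randn(1, 12, d_model)   # embeddings of just the changed fragments

# The changed fragments "query back" into the full change representation.
fused, _ = attn(query=changed_frag, key=whole_change, value=whole_change)
change_vector = fused.mean(dim=1)            # pooled feature vector for downstream tasks
print(change_vector.shape)                   # torch.Size([1, 768])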
To evaluate CCRep and demonstrate its applicability to diverse code-change-related tasks, we apply it to three tasks: commit message generation, patch correctness assessment, and just-in-time defect prediction. Experimental results show that CCRep outperforms the state-of-the-art techniques on each task.
IEEE Transactions on Software Engineering, Journal Year: 2024, Volume and Issue: 50(3), P. 474 - 494. Published: Jan. 17, 2024
Automated
program
repair
(APR)
aims
to
fix
software
bugs
automatically
without
human
debugging
efforts
and
plays
a
crucial
role
in
software development and maintenance.
Despite the recent significant progress in the number of fixed bugs, APR is still challenged by the long-standing overfitting problem (i.e., a generated patch can be plausible but overfitting).
Various
techniques
have
thus
been
proposed to address this problem.
Recently, researchers have employed BERT to extract code features, which are then used to train a classifier for patch correctness prediction, indicating the potential of such pre-trained models in reasoning about patch correctness.
However, BERT is restricted to feature extraction for classifier training without benefiting from the training process, potentially generating sub-optimal vector representations for patched code snippets.
In this paper, we propose APPT, a pre-trained model-based automated patch correctness assessment technique that leverages both pre-training and fine-tuning.
APPT adopts a pre-trained model as the encoder stack, followed by an LSTM stack and a deep learning classifier. More importantly, the pre-trained model is fine-tuned in conjunction with the other components as a whole pipeline to fully adapt it specifically to reasoning about patch correctness.
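A minimal sketch of such a jointly fine-tuned pipeline (the checkpoint and layer sizes are illustrative assumptions, not APPT's exact configuration):

# Sketch of an APPT-style pipeline: pre-trained encoder + LSTM stack + classifier,
# all trained together end to end (no component is frozen).
import torch.nn as nn
from transformers import AutoModel

class PatchClassifier(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("bert-base-uncased")  # illustrative choice
        self.lstm = nn.LSTM(768, hidden, num_layers=2, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, 2)  # correct vs. overfitting

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        out, _ = self.lstm(states)
        return self.head(out[:, 0])  # classify from the first position's state

# Because no parameter is frozen, optimizing the classification loss also
# fine-tunes the encoder, adapting its representations to patch correctness.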
Although our idea is general and can be built on various existing pre-trained models, we have implemented APPT based on the BERT model.
We conduct an extensive experiment on 1,183 Defects4J patches, and the experimental results show that APPT achieves prediction accuracy of 79.7% and recall of 83.2%, outperforming the state-of-the-art technique CACHE by 4.3% and 6.7%, respectively.
Our additional investigation on 49,694 real-world patches shows that APPT achieves optimum performance (exceeding 99% in five common metrics for assessing patch classification techniques) compared with existing representation learning techniques.
We further investigate the impact of each component and find that they all positively contribute to APPT; e.g., the fine-tuning process and the LSTM stack increase the F1-score by 10.22% and 4.11%, respectively.
We also show that adopting advanced pre-trained models can provide substantial advancement (e.g., GraphCodeBERT-based APPT improves BERT-based APPT by 2.8% and 3.3% in precision and AUC, respectively), highlighting the generalizability of APPT.
Overall, our study highlights the promising future of adopting pre-trained models to assess patch correctness and reduce the manual inspection effort of experts when deploying APR tools in practice.
Journal of Systems and Software, Journal Year: 2023, Volume and Issue: 209, P. 111934. Published: Dec. 19, 2023
The
advancements
in
machine
learning
techniques
have
encouraged
researchers
to
apply these techniques to a myriad
of
software
engineering
tasks
that
use
source
code
analysis,
such
as
testing
and
vulnerability
detection.
Such a large number of studies hinders the community from understanding the current research landscape.
This paper aims to summarize the current knowledge of applied machine learning for source code analysis.
We review studies belonging to twelve categories of software engineering tasks and the corresponding machine learning techniques, tools, and datasets that have been used to solve them.
To
do
so,
we
conducted
an
extensive
literature
search and identified 494 studies.
We summarize our observations and findings with the help of the identified studies. Our findings suggest that the use of machine learning for source code analysis is consistently increasing.
We synthesize the commonly used steps and the overall workflow for each task and summarize the machine learning techniques employed.
We identify a comprehensive list of available datasets and tools usable in this context.
Finally, the paper discusses perceived challenges in this area, including the availability of standard datasets, reproducibility and replicability, and hardware resources.
Editor's
note:
Open
Science
material
was
validated
by the Journal of Systems and Software Open Science Board.
ACM Transactions on Software Engineering and Methodology, Journal Year: 2025, Volume and Issue: unknown. Published: Jan. 23, 2025
WebAssembly
(abbreviated
as
Wasm)
was
initially
introduced
for
the
Web
and
quickly
extended
its
reach
into
various
domains
beyond the Web.
To create Wasm applications, developers can compile high-level programming languages into Wasm binaries or manually write the textual format of Wasm and translate it into binaries by the toolchain.
Regardless of whether Wasm is utilized within or outside the Web, the execution of Wasm binaries is supported by the Wasm runtime.
Such
a
runtime
provides a secure, memory-efficient, and sandboxed environment to execute Wasm binaries.
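For concreteness, a hedged example of executing a module inside one such runtime, here via the wasmtime Python bindings (API as in recent wasmtime-py releases; the module itself is illustrative):

# Executing a Wasm module in a sandboxed runtime via wasmtime's Python bindings.
from wasmtime import Engine, Store, Module, Instance

wat = """
(module
  (func (export "add") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))
"""

engine = Engine()
store = Store(engine)
module = Module(engine, wat)           # the textual format is compiled by the runtime
instance = Instance(store, module, [])
add = instance.exports(store)["add"]
print(add(store, 2, 3))                # 5, computed inside the sandbox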
This paper provides a comprehensive survey of research on Wasm runtimes, with 103 collected papers related to Wasm runtimes, following the traditional systematic literature review process.
It characterizes existing studies from two different angles, including internal research (Wasm runtime design, testing, and analysis) and external research (applying Wasm runtimes to various domains).
It also proposes future research directions for Wasm runtimes.
ACM Transactions on Software Engineering and Methodology, Journal Year: 2025, Volume and Issue: unknown. Published: May 1, 2025
Software
systems
have
been
evolving
rapidly
and
inevitably
introducing
bugs
at
an
increasing
rate,
leading
to
significant
maintenance
costs.
While
large
language
models
(LLMs) have demonstrated remarkable potential in enhancing software development practices, particularly in automated program repair (APR),
they
rely
heavily
on
high-quality
code
repositories.
Most repositories are proprietary assets that capture the diversity and nuances of real-world industry code, which public datasets cannot fully represent.
However,
obtaining
such
data
from
various
industries
is
hindered
by
privacy
concerns,
as
companies are reluctant to share their codebases.
There has also been no in-depth investigation into collaborative learning on private and decentralized data while preserving privacy for program repair.
To address this gap, we investigate federated learning as a privacy-preserving method for fine-tuning LLMs to boost software maintenance.
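As a minimal sketch of the federated setup (FedAvg-style weighted averaging is an assumption for illustration, not necessarily one of the algorithms studied): each company fine-tunes locally on its private code, and only model weights are shared.

# Minimal FedAvg-style sketch: raw code never leaves a client.
import torch

def federated_average(client_state_dicts, client_sizes):
    """Weighted average of model parameters across clients."""
    total = sum(client_sizes)
    avg = {}
    for name in client_state_dicts[0]:
        avg[name] = sum(
            sd[name] * (n / total)
            for sd, n in zip(client_state_dicts, client_sizes)
        )
    return avg

# Each round: clients fine-tune a copy of the LLM on private bug-fix data,
# send updated weights, and the server aggregates and redistributes them:
# global_model.load_state_dict(federated_average(updates, sizes))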
We use the industrial dataset TutorCode and the EvalRepair-Java benchmark for evaluation, to assess whether federated learning enhances APR. We then further explore how data heterogeneity (i.e., variations in coding style, complexity, and embedding) and different federated learning algorithms affect bug fixing, and provide practical implications for collaboration.
Our evaluation reveals that federated learning can significantly enhance program repair, achieving increases of up to 16.67% in Top@10 and 18.44% in Pass@10, with bug-fixing capabilities even comparable to those of centralized learning.
Moreover, the negligible impact of data heterogeneity implies that companies can effectively collaborate despite diverse data distributions.
Different federated learning algorithms demonstrate unique strengths across LLMs, suggesting that tailoring the optimization process to specific LLM characteristics can improve bug fixing.
Journal of Software Evolution and Process, Journal Year: 2025, Volume and Issue: 37(2). Published: Feb. 1, 2025
ABSTRACT
Patches
can
help
fix
security
vulnerabilities
and
optimize
software
performance,
thereby
enhancing
the
quality
of
software.
Unfortunately,
patches
generated
by
automated
program
repair
tools
are
not
always
correct,
as
they
may
introduce
new
bugs
or
fail
to
fully rectify the original issue.
Various
methods
for
evaluating
patch
correctness
have
been
proposed.
However, most face the challenge of capturing long‐distance dependencies in patch evaluation, which leads to a decline in the predictive performance of the models.
To address this challenge, this paper presents a method named Qamhaen to evaluate patch correctness in APR.
Specifically, the text embedding component captures dependencies across functions in patch evaluation by using bug reports and patch descriptions as inputs instead of code snippets.
BERT is employed in pretraining to capture these dependencies, followed by an additional multihead self‐attention mechanism for further feature extraction.
The similarity evaluator devises a similarity calculation to assess the effectiveness of patches in resolving the issues outlined in bug reports.
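A hedged sketch of this evaluator (the checkpoint, pooling, and wiring are assumptions, not Qamhaen's exact design): embed the bug report and the patch description, refine with multihead self‐attention, and score the patch by similarity.

# Sketch of the report/description similarity idea (untrained layers, illustrative only).
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative checkpoint
bert = AutoModel.from_pretrained("bert-base-uncased")
attn = torch.nn.MultiheadAttention(embed_dim=768, num_heads=8, batch_first=True)

def embed(text):
    states = bert(**tok(text, return_tensors="pt")).last_hidden_state
    refined, _ = attn(states, states, states)  # extra multihead self-attention
    return refined.mean(dim=1)                 # pooled text embedding

report = "NullPointerException when parsing an empty config file"
desc = "Add a null check before reading the configuration entries"
score = F.cosine_similarity(embed(report), embed(desc))  # higher = better match
print(float(score))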
Comprehensive experiments are conducted on a dataset containing 9135 patches, with AUC as the primary assessment metric, and extensive results demonstrate that Qamhaen outperforms the baseline methods in terms of overall AUC, F1, +Recall, ‐Recall, and Precision.
For example, compared with the baselines, Qamhaen achieves an AUC of 0.691, representing improvements of 24.2%, 22.1%, and 6.3% over the baseline methods, respectively.
Proceedings of the ACM on Programming Languages, Journal Year: 2025, Volume and Issue: 9(OOPSLA1), P. 1831 - 1857. Published: April 9, 2025
Automated
Program
Repair
(APR)
holds
the
promise
of
alleviating the burden of debugging
and
fixing
software
bugs.
Despite
this,
developers
still
need
to
manually
inspect each patch to confirm its correctness, which is tedious and time-consuming.
This challenge is exacerbated in the presence of plausible patches, which accidentally pass test cases but may not correctly fix the bug.
To
address
this
challenge,
we
propose
an
interactive
approach
called
iFix to facilitate patch understanding and comparison based on their runtime difference.
iFix performs static analysis to identify variables related to the buggy statement and captures their values during the execution of each patch.
These values are then aligned across different patch candidates, allowing users to compare and contrast their runtime behavior.
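A minimal sketch of this alignment step (the trace layout is an assumption): values of the same variable are placed side by side across candidates so that divergences stand out.

# Aligning captured runtime values across patch candidates (illustrative data).
traces = {
    "patch_1": {"balance": 100, "fee": 0, "result": 100},
    "patch_2": {"balance": 100, "fee": 5, "result": 95},
    "patch_3": {"balance": 100, "fee": 5, "result": 105},
}

variables = sorted({v for trace in traces.values() for v in trace})
for var in variables:
    row = {patch: trace.get(var) for patch, trace in traces.items()}
    flag = " <- differs" if len(set(row.values())) > 1 else ""
    print(var, row, flag)  # divergent values point users to behavioral differences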
To evaluate iFix, we conducted a within-subjects user study with 28 participants.
Compared with manual inspection and a state-of-the-art patch filtering technique, iFix reduced participants’ task completion time by 36% and 33%, while also improving their confidence by 50% and 20%, respectively.
Besides, quantitative experiments demonstrate that iFix improves the ranking of correct patches by at least 39% compared with other patch ranking methods and is generalizable to different APR tools.
Journal of Software Evolution and Process, Journal Year: 2025, Volume and Issue: 37(4). Published: April 1, 2025
ABSTRACT
The
recognition
of
Application
Programming
Interface
(API)
mentions
in
software‐related
texts
is
vital
for
extracting
API‐related
knowledge,
providing
deep
insights
into
API
usage
and
enhancing
productivity and efficiency.
Previous research identifies two primary technical challenges in this task: (1) differentiating APIs from common words and (2) identifying morphological variants of standard APIs.
While deep learning‐based methods have demonstrated advancements in addressing these challenges, they rely heavily on high‐quality labeled data, leading to another significant data‐related challenge: (3) the lack of such data due to the substantial effort required for labeling.
To overcome these challenges, this paper proposes a context‐aware API recognition method named CARLDA.
This approach utilizes two key components, namely, Bidirectional Encoder Representations from Transformers (BERT) and Bi‐directional Long Short‐Term Memory (BiLSTM), to extract context information at both the word and sequence levels, capturing syntactic and semantic information to address the first challenge.
For the second challenge, it incorporates a character‐level BiLSTM with an attention mechanism to grasp the global context, enhancing the features of morphological variants within API mentions.
For the third challenge, we developed specialized data augmentation techniques using large language models (LLMs) to tackle in‐library and cross‐library data shortages.
These techniques generate a variety of samples through targeted transformations (e.g., replacing tokens and restructuring sentences) and hybrid strategies (e.g., combining real‐world and generated samples while applying style rules to replicate authentic programming contexts).
Given the uncertainty about the quality of LLM‐generated samples, we also design sample selection algorithms to filter out low‐quality samples (i.e., incomplete or incorrectly labeled samples).
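A hedged sketch of this augmentation-and-filtering idea (the transformation rule and quality checks are illustrative assumptions, not CARLDA's actual algorithms):

# Illustrative token-replacement augmentation plus a simple quality filter.
import re

def augment_by_token_replacement(sentence, api, replacement_api):
    """Targeted transformation: swap one API mention for another known API."""
    return sentence.replace(api, replacement_api)

def keep_sample(sentence, expected_api):
    """Reject incomplete or incorrectly labeled generated samples."""
    complete = sentence.strip().endswith((".", "?", "!"))
    has_label = re.search(re.escape(expected_api), sentence) is not None
    return complete and has_label

seed = "You can call json.loads to parse the response body."
candidate = augment_by_token_replacement(seed, "json.loads", "json.dumps")
print(candidate, keep_sample(candidate, "json.dumps"))  # kept only if well-formed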
Moreover, specific datasets have been constructed to evaluate CARLDA's ability to handle the aforementioned challenges.
Experimental results demonstrate that CARLDA significantly enhances the F1 score by 11.0% and the Matthews correlation coefficient (MCC) by 10.0% compared with state‐of‐the‐art methods, showing superior overall performance in effectively tackling the aforementioned challenges, and that the LLM‐based augmentation techniques can successfully yield new samples that alleviate the labeling burden.