VAWKUM Transactions on Computer Sciences, 2024, 12(2), pp. 16-27. Published: Oct. 9, 2024
In the world of telecommunications businesses, customer turnover poses a significant hurdle that can impact profits and weaken loyalty over time. Our solution to this challenge involves a method using Machine Learning (ML) tools to predict churn with precision. We work with a set of 7[...] In our research study, we examined how well three different machine learning models performed: Random Forest (RF), Cat Boost (CB), and K nearest neighbors (KNN). Out of these tested models, one model stood out for its performance, achieving 99 percent accuracy and precision along with an 88 percent recall rate and F1 score; additionally, it achieved an AUC of 0.99. These results clearly demonstrate the model's ability in identifying customers who are likely to churn. The findings hold importance for telecommunications companies, as they are equipped with a valuable resource to proactively tackle churn issues and customize solutions to retain key clients while boosting overall customer happiness levels. In an increasingly competitive market landscape where keeping customers is crucial for business success, this research provides a data-supported roadmap for continual expansion and staying ahead in the telecom industry. The critical relevance of churn prediction for telecom firms is underscored by the tangible advantages of leveraging ML for predicting churn. By utilizing advanced technology, companies can identify at-risk customers and take targeted measures to prevent them from leaving. This not only helps retain customers but also improves customer satisfaction. In a constantly evolving market, having access to predictive analytics can give companies an edge and ensure long-term success in the industry.
Technologies, 2025, 13(3), p. 88. Published: Feb. 20, 2025
This study examines the efficacy of Random Forest and XGBoost classifiers in conjunction with three upsampling techniques: SMOTE, ADASYN, and Gaussian noise upsampling (GNUS), across datasets with varying class imbalance levels, ranging from moderate to extreme (15% to 1% churn rate). Employing metrics such as F1 score, ROC AUC, PR AUC, Matthews Correlation Coefficient (MCC), and Cohen's Kappa, this research provides a comprehensive evaluation of classifier performance under different imbalance scenarios, focusing on applications in the telecommunications domain. The findings highlight that tuned XGBoost paired with SMOTE (Tuned_XGB_SMOTE) consistently achieves the highest F1 score and robust performance across all imbalance levels. SMOTE emerged as the most effective upsampling method, particularly when used with XGBoost, whereas Random Forest performed poorly under severe imbalance. ADASYN showed moderate effectiveness with XGBoost but underperformed with Random Forest, and GNUS produced inconsistent results. This study underscores the impact of data imbalance, with MCC, Kappa, and F1 scores fluctuating significantly, whereas ROC AUC remained relatively stable. Moreover, rigorous statistical analyses employing the Friedman test and Nemenyi post hoc comparisons confirmed that the observed improvements in F1 score, PR-AUC, Kappa, and MCC were statistically significant (p < 0.05), with Tuned_XGB_SMOTE significantly outperforming Tuned_RF_GNUS. While differences in ROC-AUC were not significant, the consistency of these results across multiple metrics underscores the reliability of our framework, offering a statistically validated and attractive solution for model selection in imbalanced classification scenarios.
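To make the imbalance-sensitive metrics named in this abstract concrete, here is a minimal sketch (not the study's code) that computes F1, MCC, and Cohen's Kappa from binary confusion-matrix counts. The function names and the toy counts (1,000 customers at a 1% churn rate) are illustrative assumptions.

```python
import math

def f1_score(tp, fp, fn):
    """F1 for the positive (churn) class; 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; 0.0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def cohen_kappa(tp, tn, fp, fn):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0

# Hypothetical classifier that always predicts "no churn" on 1% churn data.
always_no = dict(tp=0, tn=990, fp=0, fn=10)
accuracy = (always_no["tp"] + always_no["tn"]) / 1000

# Hypothetical classifier that finds half of the churners.
partial = dict(tp=5, tn=985, fp=5, fn=5)

for name, c in (("always_no", always_no), ("partial", partial)):
    print(name,
          round(f1_score(c["tp"], c["fp"], c["fn"]), 3),
          round(mcc(**c), 3),
          round(cohen_kappa(**c), 3))
```

In the first toy case accuracy is 0.99 while F1, MCC, and Kappa are all zero, which illustrates why a study of extreme imbalance would report these metrics rather than accuracy alone.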
Algorithms, 2025, 18(3), p. 167. Published: March 14, 2025
With the rapid development of industrialization and urbanization, air pollution is becoming increasingly serious. Accurate prediction of PM2.5 concentration is of great significance to environmental protection and public health. Our study takes the Nanning urban area, which has unique geographical, climatic, and pollution source characteristics, as the study object. Based on dual-time resolution raster data of the China High-resolution High-quality PM2.5 Dataset (CHAP) from 2012 to 2023, PM2.5 concentration prediction is carried out using SARIMA, Prophet, and LightGBM models. The study systematically compares the performance of each model in the spatial and temporal dimensions using indicators such as mean square error (MSE), mean absolute error (MAE), and the coefficient of determination (R²). The results show that LightGBM has a strong ability to mine complex nonlinear relationships, but its stability is poor. Prophet has obvious advantages in dealing with the seasonality and trend of time series, but it lacks adaptability to complex changes. SARIMA is based on time series theory and performs well in some scenarios, but has limitations in dealing with non-stationary data and spatial heterogeneity. Our research provides a multi-dimensional model performance reference for subsequent PM2.5 concentration predictions, helps researchers select models reasonably according to different scenarios and needs, provides new ideas for analyzing PM2.5 concentration change patterns, and promotes the development of related research in the field of environmental science.
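As a reference for the three indicators named in this abstract, the following is a minimal sketch of MSE, MAE, and R² in plain Python. The function names and the toy values are illustrative assumptions; they are not PM2.5 measurements or results from the study.

```python
def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
print(mse(observed, predicted), mae(observed, predicted), r2(observed, predicted))
```

MSE penalizes large errors more heavily than MAE, while R² expresses the error relative to the variance of the observations, so the three indicators can rank models differently.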
Applied Sciences, 2025, 15(3), p. 1599. Published: Feb. 5, 2025
Predicting customer churn is essential for telecommunications companies to maintain profitability. However, training models on historical data leads to performance degradation when they are applied to future conditions, a phenomenon known as concept drift. We employ a sliding-window approach that separates the training and testing time windows, creating a future-based “true test”. Using unique real data, we show that a CatBoost classifier model trained on older data can remain relevant on new, unseen data, depending on the time intervals used. A key innovation of our work is the use of 40-day “partial churn” labels; a model trained on these labels accurately predicts 90-day churn by simply adjusting the decision threshold. Out of six modeled scenarios, in the main realistic scenario, the model retained an accuracy above 0.798 and an F1 near 0.704, reflecting its robustness even under real-world data delays and potential concept drift. Overall, the findings emphasize that models do not necessarily “expire” with time; rather, their performance varies according to the time intervals tested. This research underscores the importance of truly future-based evaluation (instead of artificial splits) and offers practical guidance for earlier churn detection when facing data delays.
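The sliding-window idea described in this abstract, where the test window always lies strictly after the training window, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name, the day-index representation, and the `gap` parameter (standing in for a data delay) are all hypothetical.

```python
def sliding_windows(n_days, train_len, test_len, gap=0, step=None):
    """Yield (train_days, test_days) ranges of day indices.

    Each test range starts after its training range ends (plus an
    optional gap), so the model is always evaluated on later data.
    """
    step = step or test_len
    start = 0
    while start + train_len + gap + test_len <= n_days:
        train_days = range(start, start + train_len)
        test_start = start + train_len + gap
        test_days = range(test_start, test_start + test_len)
        yield train_days, test_days
        start += step

# Hypothetical setup: 360 days of history, 180-day training windows,
# 90-day test windows, sliding forward 90 days at a time.
for train_days, test_days in sliding_windows(360, 180, 90, step=90):
    print(f"train days {train_days.start}-{train_days.stop - 1}, "
          f"test days {test_days.start}-{test_days.stop - 1}")
```

Unlike a random train/test split, no test day ever precedes a training day here, which is what makes the evaluation future-based.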
AI, 2025, 6(4), p. 73. Published: April 10, 2025
The banking industry faces significant challenges from high customer churn rates, which threaten long-term revenue generation. Traditionally, churn models assess service quality using customer satisfaction metrics; however, these subjective variables often yield low predictive accuracy. This study examines the relationship between customer attrition and account balance using decision trees (DT), random forests (RF), and gradient-boosting machines (GBM). The research utilises a customer churn dataset and applies synthetic oversampling to balance the class distribution during preprocessing of the financial variables. Account balance is the primary factor in predicting churn, as it yields more accurate predictions compared to traditional subjective assessment methods. The tested model set achieved its highest performance by applying boosting. The evaluation of the data highlights the critical role of financial indicators in shaping effective customer retention strategies. By leveraging machine learning intelligence, banks can make informed decisions, attract new clients, and mitigate churn risk, ultimately enhancing long-term financial results.
Scientific Reports, 2025, 15(1). Published: May 9, 2025
This study examines how imbalanced datasets affect the accuracy of machine learning models, especially in predictive analytics applications such as churn prediction. When datasets are skewed towards the majority class, it can lead to biased model performance, reducing overall effectiveness. To analyze this impact, the research utilizes a churn dataset to evaluate how data imbalance influences model accuracy. The study utilized nine individual classifiers along with six homogeneous ensemble models to evaluate the effects of imbalance on performance. Single classifier models struggle to identify the underlying patterns in the data, while ensembles improve performance by focusing on the minority class. However, when trained on unbalanced data, their performance remains subpar. The top models were selected for further investigation based on the unbalanced data. A SMOTE sampling technique was employed to create a balanced dataset, ensuring that all classes are adequately represented. The generated model's performance improved from 61% to 79%, indicating the removal of bias in the target. The results showed that Adaboost, an optimal classifier, demonstrated superior performance, with an F1-Score of 87.6% in identifying potential churn and assessing customer account health. The findings emphasize the importance of balanced data for accurate ML predictions.
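The core step of SMOTE, as used in this abstract, is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a simplified sketch of that step only, not the study's pipeline; the function name, parameters, and toy points are illustrative assumptions. Practical work would more likely use an established implementation such as the `SMOTE` class in the imbalanced-learn package.

```python
import math
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a random
    minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Candidate neighbours: every other minority point, nearest first.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + lam * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical two-feature minority-class samples.
churners = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 1.8)]
new_points = smote_like(churners, n_new=6)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point lies on a segment between two real minority samples, the new data stays inside the region the minority class already occupies instead of duplicating existing rows.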
Applied Sciences, 2024, 14(20), p. 9226. Published: Oct. 11, 2024
Customer retention is a key priority for mobile telecommunications companies, as acquiring new customers is significantly more costly than retaining existing ones. A major challenge in this field is predicting customer churn, that is, users discontinuing services. Traditional predictive models such as rule-based systems often struggle with the complex, non-linear nature of customer behavior. To address this, we propose the use of deep learning techniques, specifically multi-layer perceptron (MLP) and radial basis function (RBF) networks, to improve the accuracy of churn predictions. However, while neural networks excel in predictive performance, they are often criticized for being “black-box” models, lacking interpretability. A real-world data set is considered, which originally contained information about 15,000 randomly selected clients. Various network structures and configurations are analyzed. The obtained results are compared with those generated using fuzzy and rough-set systems. The MLP model achieved an almost perfect accuracy of 0.999 and an F-measure of 0.989, outperforming traditional methods. Although the RBF network slightly lagged in accuracy, it demonstrated superior recall of 0.993, indicating better identification of potential churners. These results demonstrate that deep learning techniques enhance the performance of churn modeling. The interpretability of the models is also discussed, since it bears significance in real applications. Our contribution lies in showing that deep learning can improve churn prediction, though interpretability remains a critical area for future work.
Business Strategy & Development, 2024, 7(4). Published: Nov. 5, 2024
Employee churn or attrition presents significant challenges, especially in emerging markets, where it can disrupt business operations and inflate recruitment costs. This research leverages machine learning techniques to predict employee churn, focusing on developing sustainable and inclusive retention strategies that enhance business competitiveness. By analyzing a range of predictive algorithms and key variables associated with employee churn, the study identifies the most effective models for predicting attrition. A comprehensive exploratory data analysis was conducted using an indigenous model, offering practical insights for human resource management in emerging markets. The findings align with the sustainable development goals (SDGs), promoting decent work and economic growth. This research contributes to business strategy by proposing data-driven solutions that enhance workforce stability and sustainable business development.