The Impact of Balancing Techniques and Feature Selection on Machine Learning Models for Diabetes Detection
Fırat Üniversitesi Mühendislik Bilimleri Dergisi,
Journal Year: 2025, Volume and Issue: 37(1), P. 303 - 320
Published: Jan. 24, 2025
The detection of diabetes is crucial for effective management and prevention of the disease, which poses significant health risks globally. This study introduces a novel approach to diabetes detection by combining advanced data balancing techniques and feature selection methods, including Lasso (L1) regularization, to enhance the performance of predictive models in imbalanced datasets. Techniques such as Random Under Sampling (RUS), Adaptive Synthetic Sampling (ADASYN), and Synthetic Minority Over-sampling Technique (SMOTE) were employed alongside Random Forest (RF), CatBoost (CB), Extreme Gradient Boosting (XGB), K-Nearest Neighbors (KNN), Gaussian Naive Bayes (GNB), Logistic Regression (LR), and Gradient Boosting (GB) to assess their impact on model accuracy and generalization capabilities. The findings reveal that RF achieved the highest accuracy of 93.25% when utilizing the SMOTE technique, underscoring the importance of appropriate data handling strategies in improving predictive outcomes. Furthermore, when all features were utilized without feature selection, an accuracy of 95.31% was attained, indicating the model's capacity to capture complex patterns when feature richness is maximized. The comprehensive methodology used achieved higher accuracy than research in the literature and provided important outputs for developing reliable prediction models in healthcare.
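The oversampling idea named in the abstract (SMOTE) can be illustrated with a small standard-library sketch. This is a simplified illustration only, not the study's implementation: the function name `smote_sketch`, the toy data, and the parameters `k` and `seed` are assumptions introduced here, and the abstract does not say which library or settings were used.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment between base and neighbour.
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 6 majority rows and 3 minority rows, two features each.
majority = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4), (0.4, 0.1)]
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]

# Generate just enough synthetic minority rows to match the majority count.
new_points = smote_sketch(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic rows do not leak into the data used for evaluation.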
Language: English
Application of Transformer-Based Deep Learning Models for Predicting the Suitability of Water for Agricultural Purposes
K. Rejini,
Visumathi James,
C. Heltin Genitha
et al.
Water,
Journal Year: 2025, Volume and Issue: 17(9), P. 1347 - 1347
Published: April 30, 2025
Water is the most vital component for the sustainability of living beings on Earth. From plants to human beings, every single living being on Earth needs water for its survival. In this research, a novel model has been developed in order to predict the suitability of water for agricultural purposes. This research used the ALBERT Base v2 model for detecting water quality and named the result the ALBERT Water Potability Detection (ALBERT-WPD) model, customized from the ALBERT transformer model. The model was tested using a dataset from Kaggle, and its performance was evaluated. The dataset used ten parameters. The performance of both models was measured using the metrics accuracy, precision, recall, and F1-score. Traditional models (CNN and RNN) were compared against the ALBERT models to measure their efficiency in water potability prediction. The findings revealed that the ALBERT models gained higher accuracies than the traditional models: ALBERT Base v2 reached 91% accuracy and the altered ALBERT-WPD model rendered 96% accuracy. The classification results (precision, recall, F1-score) obtained for the potability class include 93% and 98%, and those for the non-potability class include 95% and 96%. The study found that water potability detection achieves higher accuracy with the optimization method. It concludes that the ALBERT (BERT-based) models achieve higher accuracy (>95%) with fewer parameters in comparison to traditional models, which utilize more parameters. The ALBERT models also exhibit rapid data processing and handle large datasets efficiently; handling such complicated datasets is difficult when using traditional models, as they have vanishing gradient problems and encounter temporal loss challenges. Thus, the significance of the proposed model dwells within the use of “transformers” as an advanced machine learning model for detecting water quality, showing that transformers are the future of machine learning.
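The per-class metrics mentioned in the abstract (precision, recall, F1-score) follow from counts of true positives, false positives, and false negatives. The sketch below shows that arithmetic on made-up labels; the function name `class_metrics` and the toy label lists are assumptions for illustration and are not taken from the study's data.

```python
def class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, f1) treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Hypothetical labels: 1 = potable, 0 = non-potable.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
potable = class_metrics(y_true, y_pred, positive=1)
non_potable = class_metrics(y_true, y_pred, positive=0)

print(f"accuracy={accuracy:.2f}")
print("potable     P/R/F1: " + ", ".join(f"{v:.2f}" for v in potable))
print("non-potable P/R/F1: " + ", ".join(f"{v:.2f}" for v in non_potable))
```

Reporting the metrics for each class separately, as the abstract does, makes it visible when a model favors one class even though overall accuracy looks high.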
Language: English