Decision Support Systems,
Journal year: 2024, Issue: 180, p. 114196
Published: Feb. 19, 2024
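Categorization is one of the basic tasks in machine learning and data analysis. Building on formal concept analysis (FCA), the starting point of the present work is that different ways to categorize a given set of objects exist, which depend on the choice of the sets of features used to classify them, and different such sets of features may yield better or worse categorizations, relative to the task at hand. In their turn, the (a priori) choice of a particular set of features over another might be subjective and express a certain epistemic stance (e.g. interests, relevance, preferences) of an agent or a group of agents, namely, their interrogative agenda. In the present paper, we represent interrogative agendas as sets of features, and explore and compare different ways to categorize objects w.r.t. different sets of features (agendas). We first develop a simple unsupervised FCA-based algorithm for outlier detection which uses categorizations arising from different agendas. We then present a supervised meta-learning algorithm to learn suitable (fuzzy) agendas for categorization as sets of features with different weights or masses. We combine this meta-learning algorithm with the unsupervised outlier detection algorithm to obtain a supervised outlier detection algorithm. We show that these algorithms perform at par with commonly used algorithms for outlier detection on commonly used datasets in outlier detection. These algorithms provide both local and global explanations of their results.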
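As a rough illustration of categorizing objects w.r.t. different agendas, the sketch below restricts a small formal context to each agenda (a set of features) and measures the size of the category generated by each object. The scoring rule (one minus the average relative category size), the toy context, and the names `extent` and `outlier_scores` are assumptions made for this example; the abstract does not specify the paper's actual scoring.

```python
def extent(context, obj, agenda):
    """Extent of the object concept of `obj` in the context restricted to `agenda`.

    `context` maps each object to its set of features. The extent is the set of
    objects that have every agenda feature that `obj` has.
    """
    intent = context[obj] & agenda
    return {g for g, feats in context.items() if intent <= feats}


def outlier_scores(context, agendas):
    """Assumed scoring rule: objects that fall into small categories score high."""
    n = len(context)
    scores = {}
    for g in context:
        sizes = [len(extent(context, g, a)) for a in agendas]
        scores[g] = 1.0 - sum(sizes) / (n * len(agendas))
    return scores


# Toy context: o1 and o2 are typical, o4 shares no feature with the others.
context = {
    "o1": {"a", "b"},
    "o2": {"a", "b"},
    "o3": {"a", "b", "c"},
    "o4": {"d"},
}
agendas = [{"a", "b"}, {"c", "d"}, {"a", "b", "c", "d"}]
scores = outlier_scores(context, agendas)
```

Because each score is built from per-agenda category sizes, the agenda under which an object becomes isolated can be read off directly, which is one way to read the local explanations mentioned in the abstract.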
IEEE Transactions on Fuzzy Systems,
Journal year: 2023, Issue: 31(12), pp. 4516 - 4528
Published: June 23, 2023
The performance of multilabel learning depends heavily on the quality of the input features. A mass of irrelevant and redundant features may seriously affect the performance of multilabel learning, and feature selection is an effective technique to solve this problem. However, most multilabel feature selection methods mainly emphasize removing these useless features, and the exploration of feature interaction is ignored. Moreover, the widespread existence of real-world data with uncertainty, ambiguity, and noise limits the performance of feature selection. To this end, our work is dedicated to designing an efficient and robust multilabel feature selection scheme. First, the distribution character of multilabel data is analyzed to generate robust fuzzy multineighborhood granules. By exploring the classification information implied in the data under the granularity structure, a robust multilabel $k$-nearest neighbor fuzzy rough set model is constructed, and the concept of fuzzy dependency is studied. Second, a series of fuzzy uncertainty measures in $k$-nearest neighbor fuzzy rough approximation spaces are studied to analyze the correlations of feature pairs, including interactivity. Third, by investigating the uncertainty measure between feature and label, multilabel data is modeled as a complete weighted graph. Then, these vertices are assessed iteratively to guide the assignment of feature weights. Finally, a graph structure-based robust multilabel feature selection algorithm (GRMFS) is designed. The experiments are conducted on 15 multilabel datasets. The results verify the superior performance of GRMFS as compared to nine representative feature selection methods.
IEEE Transactions on Knowledge and Data Engineering,
Journal year: 2023, Issue: 36(5), pp. 2082 - 2095
Published: Sep. 5, 2023
Outliers carry significant information to reflect an anomaly mechanism, so outlier detection facilitates relevant data mining. In terms of outlier detection, the classical approaches from distances apply to numerical data rather than nominal data, while the recent methods on basic rough sets deal with nominal data rather than numerical data. Aiming at wide outlier detection on numerical, nominal, and hybrid data, this paper investigates three-way neighborhood characteristic regions and corresponding fusion measurement to advance outlier detection. First, neighborhood rough sets are deepened via three-way decision, and they derive three-way neighborhood structures on model boundaries, inner regions, and characteristic regions. Second, the three-way neighborhood characteristic regions motivate the information fusion and weight measurement regarding all features, and thus, a multiple neighborhood outlier factor emerges to establish a new method of outlier detection; furthermore, a relevant outlier detection algorithm (called 3WNCROD) is designed to comprehensively process numerical, nominal, and mixed data. Finally, the 3WNCROD algorithm is experimentally validated, and it generally outperforms 13 contrast algorithms to perform better for outlier detection.
International Journal of Data Science and Analytics,
Journal year: 2024, Issue: unknown
Published: May 20, 2024
Outlier detection is a widely used technique for identifying anomalous or exceptional events across various contexts. It has proven to be valuable in applications like fault detection, fraud detection, and real-time monitoring systems. Detecting outliers in real time is crucial in several industries, such as financial fraud detection and quality control in manufacturing processes. In the context of big data, the amount of data generated is enormous, and traditional batch mode methods are not practical since the entire dataset is not available. The limited computational resources further compound this issue. Boxplot is a widely used batch mode algorithm for outlier detection that involves several derivations. However, the lack of an incremental closed form for statistical calculations during boxplot construction poses considerable challenges for its application within the realm of big data. We propose an incremental/online version of the boxplot algorithm to address these challenges. Our proposed algorithm is based on an approximation approach that involves numerical integration of the histogram and calculation of the cumulative distribution function. This approach is independent of the dataset's distribution, making it effective for all types of distributions, whether skewed or not. To assess the efficacy of the proposed algorithm, we conducted tests using simulated datasets featuring varying degrees of skewness. Additionally, we applied the algorithm to a real-world dataset concerning software fault detection, which posed a considerable challenge. The experimental results underscored the robust performance of our proposed algorithm, highlighting its efficacy comparable to batch mode methods that access the entire dataset. Our online boxplot method, leveraging the entire dataset distribution to define whiskers, consistently achieved exceptional outlier detection results. Notably, our algorithm demonstrated computational efficiency, maintaining constant memory usage with minimal hyperparameter tuning.
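The histogram-based idea in this abstract can be sketched as follows: keep a fixed-size histogram of the stream, accumulate it into an approximate cumulative distribution function, read the quartiles off that CDF, and place the whiskers from them. This is a minimal sketch under stated assumptions, not the authors' algorithm: the class name `StreamingBoxplot`, the fixed value range `[lo, hi]`, the bin count, and the classic 1.5 × IQR whisker rule are all choices made for the example.

```python
import numpy as np


class StreamingBoxplot:
    """Constant-memory boxplot approximation over a fixed-range histogram."""

    def __init__(self, lo, hi, bins=256):
        # Assumption: the value range [lo, hi] is known in advance.
        self.edges = np.linspace(lo, hi, bins + 1)
        self.counts = np.zeros(bins, dtype=np.int64)

    def update(self, x):
        # Find the bin of x; out-of-range values are clipped into the end bins.
        i = int(np.searchsorted(self.edges, x, side="right")) - 1
        i = min(max(i, 0), len(self.counts) - 1)
        self.counts[i] += 1

    def quantile(self, q):
        # Integrate the histogram into a CDF, then interpolate inside one bin.
        cum = np.cumsum(self.counts)
        target = q * cum[-1]
        i = min(int(np.searchsorted(cum, target, side="left")), len(cum) - 1)
        prev = cum[i - 1] if i > 0 else 0
        frac = (target - prev) / self.counts[i] if self.counts[i] > 0 else 0.0
        return float(self.edges[i] + frac * (self.edges[i + 1] - self.edges[i]))

    def fences(self, k=1.5):
        # Classic boxplot whisker limits computed from the approximate quartiles.
        q1, q3 = self.quantile(0.25), self.quantile(0.75)
        iqr = q3 - q1
        return q1 - k * iqr, q3 + k * iqr

    def is_outlier(self, x, k=1.5):
        low, high = self.fences(k)
        return x < low or x > high


# Usage: feed values one at a time, then query the whisker limits.
sb = StreamingBoxplot(0.0, 1000.0, bins=100)
for v in range(1000):
    sb.update(float(v))
low, high = sb.fences()
```

Memory use depends only on the number of bins, not on the number of observations, which matches the constant-memory property the abstract reports. The bin count is the main tuning parameter in this sketch.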