Novel resampling algorithms with maximal cliques for class-imbalance problems

Long-hui Wang, Qi Dai, Tony Du

et al.

Computers & Industrial Engineering, Journal Year: 2024, Volume and Issue: unknown, P. 110754 - 110754

Published: Nov. 1, 2024

Language: English

Class overlap handling methods in imbalanced domain: A comprehensive survey
Anil Kumar, Dinesh Singh, Rama Shankar Yadav

et al.

Multimedia Tools and Applications, Journal Year: 2024, Volume and Issue: 83(23), P. 63243 - 63290

Published: Jan. 11, 2024

Language: English

Citations

11

A majority affiliation based under-sampling method for class imbalance problem
Ying Xie, X. Huang, Feng Qin

et al.

Information Sciences, Journal Year: 2024, Volume and Issue: 662, P. 120263 - 120263

Published: Feb. 1, 2024

Language: English

Citations

4

Entropy-based hybrid sampling (EHS) method to handle class overlap in highly imbalanced dataset (Open Access)
Anil Kumar, Dinesh Singh, Rama Shankar Yadav

et al.

Expert Systems, Journal Year: 2024, Volume and Issue: 41(11)

Published: July 30, 2024

Abstract: Class imbalance and class overlap create difficulties in the training phase of standard machine learning algorithms. Their performance is poor on minority classes, especially when there is a high class imbalance and significant overlap. Researchers have recently observed that the joint effects of class imbalance and class overlap are more harmful than their direct impact. To handle these problems, many methods have been proposed in past years that can be broadly categorized as data-level, algorithm-level, ensemble learning, and hybrid methods. Existing data-level methods often suffer from problems like information loss and overfitting. To overcome these problems, we introduce a novel entropy-based hybrid sampling (EHS) method to handle class overlap in highly imbalanced datasets. The EHS method eliminates less informative majority instances from the overlap region during undersampling and regenerates synthetic minority instances during oversampling near the borderline. The proposed EHS method achieved improvement in F1-score, G-mean, and AUC metric values for DT, NB, and SVM classifiers compared to well-established state-of-the-art methods. Classifier performance was tested on 28 datasets with extreme ranges

Language: English

Citations

3

Enhancing data classification using locally informed weighted k-nearest neighbor algorithm
Hassan I. Abdalla, Ali A. Amer

Expert Systems with Applications, Journal Year: 2025, Volume and Issue: 276, P. 126942 - 126942

Published: March 16, 2025

Language: English

Citations

0

Boosting semi-supervised learning under imbalanced regression via pseudo-labeling
Nannan Zong, Songzhi Su, Changle Zhou

et al.

Concurrency and Computation Practice and Experience, Journal Year: 2024, Volume and Issue: unknown

Published: June 30, 2024

Summary: Imbalanced samples are widespread, which impairs the generalization and fairness of models. Semi-supervised learning can overcome the deficiency of rare labeled samples, but it is challenging to select high-quality pseudo-label data. Unlike discrete labels that can be matched one-to-one with points on a numerical axis, labels in regression tasks are consecutive and cannot be directly chosen. Besides, the distribution of unlabeled data is imbalanced, which easily leads to imbalanced pseudo-label data, exacerbating the imbalance of the semi-supervised dataset. To solve this problem, this article proposes a semi-supervised imbalanced regression network (SIRN), which consists of two components: A, designed to learn the relationship between features and labels (targets), and B, dedicated to learning the target deviations. To measure the deviations under an imbalanced distribution, a deviation function is introduced. To select continuous pseudo-labels, a matching strategy is designed. Furthermore, adaptive selection is developed to mitigate the risk of skewed distributions. Finally, the effectiveness of the proposed method is validated through evaluations on regression tasks. The results show a great reduction in predicted value error, particularly in few-shot regions. This empirical evidence confirms the efficacy of our method in addressing this issue.

Language: English

Citations

1

Newton cooling theorem-based local overlapping regions cleaning and oversampling techniques for imbalanced datasets
Liangliang Tao, Qingya Wang, Fen Yu

et al.

Neurocomputing, Journal Year: 2024, Volume and Issue: unknown, P. 128959 - 128959

Published: Nov. 1, 2024

Language: English

Citations

1

Generative adversarial networks for overlapped and imbalanced problems in impact damage classification
Quoc Hoan Doan, Behrooz Keshtegar, Seung-Eock Kim

et al.

Information Sciences, Journal Year: 2024, Volume and Issue: 675, P. 120752 - 120752

Published: May 18, 2024

Language: English

Citations

0

An Anomaly Detection Model Based on Pearson Correlation Coefficient and Gradient Booster Mechanism (Open Access)

Tuo Ding, He Sui

International Journal of Advanced Computer Science and Applications, Journal Year: 2024, Volume and Issue: 15(6)

Published: Jan. 1, 2024

Anomaly detection aims to build a decision model that estimates the class of new data based on historical sample features. However, the distance between samples in the feature space is sometimes very close, resulting in the class being invisible, i.e., the overlap problem. To address this issue, an anomaly detection model based on Pearson correlation coefficient and gradient booster mechanism is proposed in this paper. Unlike traditional resampling methods, the method first groups and sorts features along different dimensions such as correlation, importance, and exclusivity. Then, it selects features with higher correlation and lower importance for deletion to improve the training accuracy of the detector. Furthermore, through a unilateral sampling mechanism, ineffective or inefficient samples can be further reduced to improve the training efficiency of the detector. Finally, the method was compared with three feature selection methods and six ensemble models on datasets. The experimental results showed that the method has significant advantages in feature selection, detection performance, stability, and computational cost.

Language: English

Citations

0

Handling the Imbalanced Problem in Agri-Food Data Analysis (Creative Commons)
Adeyemi Adegbenjo, Michael Ngadi

Foods, Journal Year: 2024, Volume and Issue: 13(20), P. 3300 - 3300

Published: Oct. 17, 2024

Imbalanced data situations exist in most fields of endeavor. The problem has been identified as a major bottleneck in machine learning/data mining and is becoming a serious issue of concern in food processing applications. Inappropriate analysis of agricultural data was limiting the robustness of predictive models built from agri-food data. As a result of rare cases occurring infrequently, classification rules that detect small groups are scarce, so samples belonging to small classes are largely misclassified. Most existing machine learning algorithms, including K-means, decision trees, and support vector machines (SVMs), are not optimal in handling imbalanced data. Consequently, models developed from such data are very prone to rejection and non-adoptability in real industrial and commercial settings. This paper showcases the reality of the imbalanced data problem in agri-food applications and therefore proposes some state-of-the-art artificial intelligence algorithm approaches for handling the problem using methods including resampling, one-class learning, ensemble methods, feature selection, and deep learning techniques. The paper further evaluates newer metrics that are well suited to imbalanced data. Rightly analyzing imbalanced data in food processing application research works will improve the accuracy of results and model developments. This will consequently enhance the acceptability and adoptability of innovations/inventions.

Language: English

Citations

0

PRO-SMOTEBoost: An Adaptive SMOTEBoost Probabilistic Algorithm for Rebalancing and Improving Imbalanced Data Classification
Laouni Djafri

Information Sciences, Journal Year: 2024, Volume and Issue: unknown, P. 121548 - 121548

Published: Oct. 1, 2024

Language: English

Citations

0