Computers & Industrial Engineering, Journal Year: 2024, Volume and Issue: unknown, P. 110754 - 110754
Published: Nov. 1, 2024
Language: English
Multimedia Tools and Applications, Journal Year: 2024, Volume and Issue: 83(23), P. 63243 - 63290
Published: Jan. 11, 2024
Language: English
Citations: 11
Information Sciences, Journal Year: 2024, Volume and Issue: 662, P. 120263 - 120263
Published: Feb. 1, 2024
Language: English
Citations: 4
Expert Systems, Journal Year: 2024, Volume and Issue: 41(11)
Published: July 30, 2024
Abstract: Class imbalance and class overlap create difficulties in the training phase of standard machine learning algorithms. Their performance on minority classes is poor, especially when there is a high degree of overlap. Recently, researchers have observed that the joint effect of these two problems is more harmful than their direct individual impact. To handle these problems, many methods have been proposed in past years that can be broadly categorized as data-level, algorithm-level, ensemble learning, and hybrid methods. Existing data-level methods often suffer from problems like information loss and overfitting. To overcome these, we introduce a novel entropy-based hybrid sampling (EHS) method for highly imbalanced datasets. The EHS eliminates less informative majority instances from the overlap region during undersampling and regenerates synthetic minority instances near the borderline during oversampling. The method achieved improvements in F1-score, G-mean, and AUC values for DT, NB, and SVM classifiers compared with well-established state-of-the-art methods. Classifier performances were tested on 28 datasets with extreme ranges of class imbalance and overlap.
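The abstract describes the approach only at a high level; below is a minimal Python sketch of an entropy-based hybrid sampling step, assuming neighbourhood class entropy as the informativeness score and SMOTE-style interpolation for the borderline oversampling. It is an illustration under those assumptions, not the authors' exact EHS algorithm.

```python
# Hedged sketch of entropy-based hybrid sampling (classes coded 0 = majority, 1 = minority).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def neighbourhood_entropy(X, y, k=5):
    """Entropy of the class mix among each sample's k nearest neighbours."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    ent = np.empty(len(X))
    for i, neigh in enumerate(idx[:, 1:]):          # skip the point itself
        p = np.bincount(y[neigh], minlength=2) / k  # class proportions
        p = p[p > 0]
        ent[i] = -(p * np.log2(p)).sum()
    return ent

def entropy_hybrid_sample(X, y, k=5, drop_quantile=0.3, n_synth=100, seed=0):
    rng = np.random.default_rng(seed)
    ent = neighbourhood_entropy(X, y, k)
    maj, mino = (y == 0), (y == 1)

    # Undersampling: discard the majority samples whose neighbourhood entropy is
    # lowest (assumed here to be the "less informative" ones).
    keep_maj = maj & (ent > np.quantile(ent[maj], drop_quantile))

    # Oversampling: interpolate new minority samples around high-entropy
    # (borderline) minority points, SMOTE-style.
    border = np.where(mino & (ent >= np.median(ent[mino])))[0]
    nn = NearestNeighbors(n_neighbors=min(k, len(border))).fit(X[border])
    synth = []
    for _ in range(n_synth):
        i = rng.choice(border)
        j = border[rng.choice(nn.kneighbors(X[[i]], return_distance=False)[0])]
        synth.append(X[i] + rng.random() * (X[j] - X[i]))

    X_new = np.vstack([X[keep_maj], X[mino], np.array(synth)])
    y_new = np.concatenate([y[keep_maj], y[mino], np.ones(len(synth), dtype=int)])
    return X_new, y_new
```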
Language: English
Citations: 3
Expert Systems with Applications, Journal Year: 2025, Volume and Issue: 276, P. 126942 - 126942
Published: March 16, 2025
Language: English
Citations: 0
Concurrency and Computation Practice and Experience, Journal Year: 2024, Volume and Issue: unknown
Published: June 30, 2024
Summary: Imbalanced samples are widespread, which impairs the generalization and fairness of models. Semi-supervised learning can overcome the deficiency of rare labeled samples, but it is challenging to select high-quality pseudo-label data. Unlike discrete labels that can be matched one-to-one with points on a numerical axis, continuous targets in regression tasks cannot be chosen directly. Besides, the distribution of unlabeled data is also imbalanced, which easily leads to imbalanced pseudo-label data and exacerbates the imbalance of the semi-supervised dataset. To solve this problem, the article proposes a network termed SIRN, which consists of two components: A, designed to learn the relationship between features and targets, and B, dedicated to estimating target deviations. To measure deviations under an imbalanced distribution, a deviation function is introduced. To match continuous pseudo-labels, a matching strategy is designed. Furthermore, an adaptive selection mechanism is developed to mitigate the risk of skewed pseudo-label distributions due to imbalanced unlabeled data. Finally, the effectiveness of the proposed method is validated through evaluations on regression tasks. The results show a great reduction in predicted value error, particularly in few-shot regions. This empirical evidence confirms the efficacy of the method in addressing the imbalance issue.
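A minimal sketch, not the SIRN model itself: it only illustrates two ideas from the summary, matching continuous pseudo-labels onto target bins and adaptively selecting them so rare target regions are not drowned out. The bin count, the deviation threshold, and the inverse-frequency quota rule are assumptions for illustration.

```python
# Hedged sketch of pseudo-label matching and adaptive selection for imbalanced
# semi-supervised regression.
import numpy as np

def select_pseudo_labels(pred, deviation, y_labeled,
                         n_bins=10, base_quota=50, dev_threshold=0.5):
    """Return indices of unlabeled predictions accepted as pseudo-labels."""
    edges = np.histogram_bin_edges(y_labeled, bins=n_bins)
    labeled_counts, _ = np.histogram(y_labeled, bins=edges)

    # Matching: map each continuous prediction onto a discrete target bin.
    bins = np.clip(np.digitize(pred, edges) - 1, 0, n_bins - 1)

    accepted = []
    for b in range(n_bins):
        # Adaptive selection: bins that are rare in the labeled set get a larger
        # quota, so the pseudo-labeled set does not inherit the labeled skew.
        quota = int(np.ceil(base_quota * labeled_counts.max()
                            / (1 + labeled_counts[b])))
        cand = np.where((bins == b) & (deviation < dev_threshold))[0]
        # Keep the lowest-deviation candidates up to the quota.
        accepted.extend(cand[np.argsort(deviation[cand])][:quota])
    return np.asarray(accepted, dtype=int)
```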
Language: English
Citations: 1
Neurocomputing, Journal Year: 2024, Volume and Issue: unknown, P. 128959 - 128959
Published: Nov. 1, 2024
Language: English
Citations: 1
Information Sciences, Journal Year: 2024, Volume and Issue: 675, P. 120752 - 120752
Published: May 18, 2024
Language: English
Citations: 0
International Journal of Advanced Computer Science and Applications, Journal Year: 2024, Volume and Issue: 15(6)
Published: Jan. 1, 2024
Anomaly detection aims to build a decision model that estimates the class of new data based on historical sample features. However, the distance between samples in the feature space is sometimes very small, resulting in an invisible overlap problem. To address this issue, an anomaly detection method based on the Pearson correlation coefficient and a gradient boosting mechanism is proposed in this paper. Different from traditional resampling methods, the method first groups and sorts features along different dimensions such as correlation, importance, and exclusivity. It then selects features with higher correlation and lower importance for deletion to improve the training accuracy of the detector. Furthermore, through a unilateral sampling mechanism, ineffective or inefficient samples can be further reduced to improve efficiency. Finally, the method was compared with three feature selection methods and six ensemble models on multiple datasets. The experimental results showed that it has significant advantages in feature selection, detection performance, stability, and computational cost.
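A minimal sketch of the feature-pruning idea described above, assuming pairwise Pearson correlation with a fixed threshold and gradient-boosting feature importances; the paper's grouping by exclusivity and its unilateral sampling step are not reproduced here.

```python
# Hedged sketch: drop the lower-importance feature from every highly correlated pair,
# then train the gradient-boosting detector on the pruned feature set.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def prune_features(X, y, corr_threshold=0.9):
    """Return indices of features kept after correlation/importance pruning."""
    booster = GradientBoostingClassifier(random_state=0).fit(X, y)
    importance = booster.feature_importances_
    corr = np.corrcoef(X, rowvar=False)          # Pearson correlation matrix

    to_drop = set()
    n = X.shape[1]
    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i, j]) >= corr_threshold:
                # Of the two redundant features, delete the less important one.
                to_drop.add(j if importance[j] < importance[i] else i)
    return [f for f in range(n) if f not in to_drop]

# Usage (hypothetical X_train, y_train):
# keep = prune_features(X_train, y_train)
# detector = GradientBoostingClassifier().fit(X_train[:, keep], y_train)
```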
Language: English
Citations: 0
Foods, Journal Year: 2024, Volume and Issue: 13(20), P. 3300 - 3300
Published: Oct. 17, 2024
Imbalanced data situations exist in most fields of endeavor. The problem has been identified as a major bottleneck in machine learning/data mining and is becoming a serious issue of concern in food processing applications. Inappropriate analysis of imbalanced agricultural data has been limiting the robustness of predictive models built from agri-food data. Because rare cases occur infrequently, classification rules that detect small groups are scarce, so samples belonging to minority classes are largely misclassified. Most existing learning algorithms, including K-means, decision trees, and support vector machines (SVMs), are not optimal for handling imbalanced data. Consequently, models developed from such data are very prone to rejection and non-adoptability in real industrial and commercial settings. This paper showcases the reality of imbalanced data in food processing applications and therefore proposes some state-of-the-art artificial intelligence approaches for handling it, using methods such as resampling, one-class learning, ensemble methods, feature selection, and deep learning techniques. It further evaluates newer metrics that are well suited to imbalanced data. Rightly analyzing imbalanced data in application research works will improve the accuracy of results and model development, and consequently enhance the acceptability and adoptability of innovations/inventions.
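A small sketch of the evaluation point, using an assumed synthetic 95/5 class split: plain accuracy stays high on imbalanced data while balanced accuracy, minority-class F1, G-mean, and MCC expose the weakness; the data and model choice are assumptions, not the paper's experiments.

```python
# Hedged sketch: report metrics suited to imbalanced classification alongside accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             f1_score, matthews_corrcoef, recall_score)

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

pred = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).predict(X_te)

sensitivity = recall_score(y_te, pred)               # minority-class recall
specificity = recall_score(y_te, pred, pos_label=0)  # majority-class recall
print("accuracy          :", accuracy_score(y_te, pred))  # misleadingly high
print("balanced accuracy :", balanced_accuracy_score(y_te, pred))
print("F1 (minority)     :", f1_score(y_te, pred))
print("G-mean            :", np.sqrt(sensitivity * specificity))
print("MCC               :", matthews_corrcoef(y_te, pred))
```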
Language: English
Citations: 0
Information Sciences, Journal Year: 2024, Volume and Issue: unknown, P. 121548 - 121548
Published: Oct. 1, 2024
Language: English
Citations: 0