GDD-K-Means Text Clustering Algorithm Based on Grid Filtering Distance and Density of Outliers
Yao Wang, Bin Wang, X.M. Qi, et al.

Journal of intelligence and knowledge engineering, Year: 2024, Vol. 2(3), pp. 113-121

Published: Sep. 1, 2024

In the era of big data, fully mining and utilizing valuable data in line with strategic requirements plays a significant role in social development. Clustering algorithms can effectively partition unlabeled data sets through an unsupervised learning process, and traditional K-Means is still the most widely used at present. By studying various improved versions of the clustering algorithm, this paper addresses problems such as unsatisfactory clustering results caused by outliers and the way the choice of initial center points affects partitioning. Good results have been obtained. First, grid filtering and an LOF detection method that weighs distance and density are used to remove outliers. Then, the randomness of initial center selection is largely eliminated by combining the "max-min principle" with maximum weight, and the number of clusters is determined according to the BWP index. Experimental results show that, compared with currently popular algorithms, the proposed GDD-K-Means achieved better results on different data sets: accuracy, F-measure, and other evaluation indexes improved to a certain extent, and the calculation time complexity was reduced.
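The abstract names the "max-min principle" for choosing initial centers but gives no details. As a rough illustration only, the sketch below shows the generic max-min (farthest-first) seeding rule: each new center is the point whose distance to its nearest already-chosen center is largest. The function name `max_min_centers` and the choice of the first center (the point farthest from the data centroid) are assumptions made for this example. The paper's maximum-weight term, grid filtering, LOF outlier removal, and BWP-based choice of the cluster count are not reproduced here.

```python
import math


def max_min_centers(points, k):
    """Pick k initial centers with the generic max-min (farthest-first) rule.

    Illustrative sketch only; not the GDD-K-Means procedure from the paper.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    # Assumption: seed with the point farthest from the data centroid.
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    first = max(range(len(points)),
                key=lambda i: math.dist(points[i], centroid))
    chosen = [first]

    # nearest[i] holds the distance from point i to its closest chosen center.
    nearest = [math.dist(p, points[first]) for p in points]

    while len(chosen) < k:
        # Max-min step: take the point whose nearest center is farthest away.
        nxt = max(range(len(points)), key=lambda i: nearest[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))

    return [points[i] for i in chosen]
```

Because every step is deterministic, repeated runs on the same data return the same centers, which is the property the abstract contrasts with random initialization. The returned centers could then be handed to any K-Means implementation as its starting points.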

Language: English

Cited by: 0

Machine Learning Analysis of Carbon Inclusion Drivers in Chinese Universities
Dongping Tang, A. Abdulraheem, Xiaoli Wu, et al.

Published: Jan. 10, 2025

Language: English

Cited by: 0
