Journal of Intelligence and Knowledge Engineering, Year: 2024, No. 2(3), pp. 113-121
Published: Sep. 1, 2024
In the era of big data, fully mining and utilizing valuable data in line with strategic requirements plays a significant role in social development. Clustering algorithms can effectively partition unlabeled data sets through an unsupervised learning process, and the traditional K-Means algorithm is still the most widely used at present. By studying various improved K-Means clustering algorithms, this paper addresses problems such as unsatisfactory clustering results caused by outliers and the adverse effect of the initial center points on the partitioning. Good results have been obtained. First, grid filtering and an LOF detection method that weighs distance and density are used to remove outliers. Then, the randomness of initial center selection is largely eliminated by combining the "max-min principle" with maximum weight, and the number of clusters is determined according to the BWP index. Experimental results show that, compared with currently popular algorithms, the proposed GDD-K-Means algorithm achieves better clustering results on different data sets, improves accuracy, F-number, and other evaluation indexes to a certain extent, and reduces calculation time complexity.
Language: English
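
The abstract mentions choosing initial centers by the "max-min principle" but gives no details. As a rough illustration only, the sketch below shows the generic max-min (farthest-first) seeding rule: each new center is the point whose distance to its nearest already-chosen center is largest. It is not the paper's GDD-K-Means procedure. The function name `max_min_centers`, the `first` parameter, and the choice of starting from a fixed index are assumptions made for this example, and the abstract's "maximum weight" criterion, outlier filtering, and BWP-based choice of the cluster count are not modeled.

```python
import math


def max_min_centers(points, k, first=0):
    """Pick k initial centers with the generic max-min (farthest-first) rule.

    Hypothetical helper for illustration; not the paper's exact method.
    `first` is the index of the starting center (an assumption here).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    centers = [points[first]]
    # nearest[i] holds the distance from points[i] to its closest chosen center.
    nearest = [math.dist(p, centers[0]) for p in points]

    while len(centers) < k:
        # Take the point that is farthest from all centers chosen so far.
        idx = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[idx])
        # Refresh the nearest-center distances with the newly added center.
        nearest = [min(d, math.dist(p, points[idx]))
                   for d, p in zip(nearest, points)]
    return centers


pts = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 8)]
print(max_min_centers(pts, 3))
```

Because the rule always picks the most isolated remaining point, it is sensitive to outliers, which is one plausible reason to remove outliers before seeding, as the abstract describes.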