A Classification Model Based on Improved Reject Subspace Density Covering Algorithm

Yuan Zhang, Yan-ping Zhang, Fu-lan Qian, Qing-wei Shen


Classification is a core task of machine learning, and the high dimensionality of sample data sets remains one of the key challenges in machine learning research. To address this problem, this paper proposes a classification model based on an improved reject subspace density covering algorithm. First, the model applies an optimized density covering scheme to classify the data set within subspaces. Second, it continues to apply the subspace idea to the sample points in covers whose density is below a threshold, until all points are effectively covered. Finally, for samples rejected during testing, it introduces a new sample rejection strategy that jointly considers the number of samples inside the subspace, the explosion radius, and the relationship between the explosion radius and the rejected point. The algorithm consists of two parts, training and testing. Experimental results show that the algorithm improves the generalization ability of the classifier on high-dimensional and noisy data sets and achieves good accuracy, offering a new approach to classifying high-dimensional or large-scale data that contain noise.


subspace, covering algorithm, density optimization, sample rejection classification, generalization capability.
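
To make the training and testing split described in the abstract concrete, the following is a minimal sketch of a generic covering classifier with a fallback rule for rejected samples. It is not the authors' algorithm: it works in the full feature space and omits the subspace selection and the density threshold. The function names (train_covers, predict), the choice of cover center, and the rejection score (sample count times radius divided by distance) are all hypothetical choices made for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_covers(points, labels):
    """Greedily build spherical covers, each holding points of a single class."""
    covers = []
    uncovered = set(range(len(points)))
    while uncovered:
        # Assumption: take the lowest-index uncovered point as the next center.
        c = min(uncovered)
        center, label = points[c], labels[c]
        # The nearest point of any other class bounds how far the cover may grow.
        d_other = min(
            (dist(center, p) for p, l in zip(points, labels) if l != label),
            default=math.inf,
        )
        # Same-class uncovered points closer than d_other (the center is always kept).
        same = [c] + [
            i for i in uncovered
            if i != c and labels[i] == label and dist(center, points[i]) < d_other
        ]
        radius = max(dist(center, points[i]) for i in same)
        if radius == 0.0:
            # Lone point: fall back to half the gap to the other class (or 1.0).
            radius = d_other / 2 if math.isfinite(d_other) else 1.0
        covers.append(
            {"center": center, "radius": radius, "label": label, "count": len(same)}
        )
        uncovered -= set(same)
    return covers


def predict(covers, x):
    """Return (label, rejected); rejected is True when x lies in no cover."""
    inside = [cv for cv in covers if dist(x, cv["center"]) <= cv["radius"]]
    if inside:
        best = min(inside, key=lambda cv: dist(x, cv["center"]))
        return best["label"], False
    # Hypothetical rejection score: prefer covers that hold many samples and
    # whose radius is large relative to the distance from x to the center.
    best = max(
        covers,
        key=lambda cv: cv["count"] * cv["radius"] / dist(x, cv["center"]),
    )
    return best["label"], True


# Toy data (illustrative only): two well-separated classes in the plane.
toy_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
toy_labels = [0, 0, 0, 1, 1, 1]
toy_covers = train_covers(toy_points, toy_labels)
```

Calling predict(toy_covers, (2.0, 2.0)) exercises the rejection branch, since that point lies outside both covers. The score combines the same three quantities the abstract names (sample count, radius, and the relation between radius and the rejected point), but the specific formula here is only an assumption.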

