Training Set Compression by Incremental Clustering
Dalong Li, Steven Simske
JPRR Vol 6, No 1 (2011); doi:10.13176/11.254 
Abstract
Compression of training sets is a technique for reducing training set size without degrading classification accuracy. Reducing the size of a training set makes training more efficient and also saves storage space. In this paper, an incremental clustering algorithm, the Leader algorithm, is used to reduce the size of a training set by effectively subsampling it. Experiments on several standard data sets, using SVM and KNN as classifiers, indicate that the proposed method is more efficient than CONDENSE at reducing the size of the training set without degrading classification accuracy. While the compression ratio of the CONDENSE method is fixed, the proposed method offers a variable compression ratio through the cluster threshold value.
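The Leader algorithm referred to in the abstract is a classic single-pass incremental clustering scheme: each sample either joins the nearest existing cluster leader within a distance threshold or becomes a new leader, and the leaders then serve as the compressed training set. A minimal sketch follows; the function name, the Euclidean metric, and the tie-handling are our assumptions for illustration, not details taken from the paper:

```python
import math

def leader_cluster(points, threshold):
    """One-pass Leader clustering.

    Each point either falls within `threshold` of an existing leader
    (and is absorbed into that cluster) or becomes a new leader.
    The returned leaders form the compressed training set.
    """
    leaders = []
    for p in points:
        # Find the nearest existing leader (Euclidean distance).
        best_d = math.inf
        for q in leaders:
            d = math.dist(p, q)
            if d < best_d:
                best_d = d
        # No leader close enough: p starts a new cluster.
        if best_d > threshold:
            leaders.append(p)
    return leaders
```

Note how the threshold directly controls the compression ratio, as the abstract states: a larger threshold absorbs more points into existing clusters and yields fewer leaders, while a smaller threshold keeps more of the original training set.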