Towards Convolutional Neural Network Compression via K-Means Cluster
Abstract— With the rapid development of computing, computing power is no longer the bottleneck of machine learning research. Deep learning technologies such as deep neural networks (DNNs) are now widely used in fields such as speech recognition, smart driving, and image recognition. However, as DNN structures become more complex and accuracy requirements rise, network sizes keep growing. The storage cost of such networks is high, which prohibits their use on resource-constrained devices such as embedded and mobile devices. In this paper, we propose a new compression method that compresses DNN models without losing accuracy by clustering the trained weights. Specifically, K-means is used to cluster the weights of the fully connected layers, and each clustered value is then encoded as an index label. When the network model is stored, only the index labels of the weights are stored. Compared with the original 32-bit weights, an index label usually takes only three or four bits, depending on the number of clusters, which yields the compression. Our method achieves an 11.4x compression rate on AlexNet.
Index Terms— neural network, compression, cluster, K-means.
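The weight-clustering scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans_1d`, `compress_layer`, `compression_ratio`), the evenly spaced centroid initialization, the random stand-in weights, and the choice to count a 32-bit codebook in the stored size are all assumptions made for illustration.

```python
import math
import numpy as np

def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's k-means on a 1-D array; returns (codebook, labels)."""
    # Assumption: centroids start evenly spaced over the weight range.
    codebook = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        # Assign every weight to its nearest centroid.
        labels = np.abs(values[:, None] - codebook[None, :]).argmin(axis=1)
        for j in range(k):
            members = values[labels == j]
            if members.size:  # keep the old centroid if a cluster is empty
                codebook[j] = members.mean()
    # Final assignment against the updated centroids.
    labels = np.abs(values[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook, labels

def compress_layer(weights, k):
    """Cluster one layer's weights; return codebook, index labels, bits per index."""
    codebook, labels = kmeans_1d(weights.ravel(), k)
    bits = max(1, math.ceil(math.log2(k)))  # 8 clusters -> 3 bits, 16 -> 4 bits
    # uint8 is only a convenient container here; real storage would pack
    # the indices down to `bits` bits each.
    return codebook.astype(np.float32), labels.astype(np.uint8).reshape(weights.shape), bits

def compression_ratio(n_weights, k, bits, float_bits=32):
    """Original size over (packed indices + codebook), both in bits."""
    original = n_weights * float_bits
    compressed = n_weights * bits + k * float_bits
    return original / compressed

# Hypothetical stand-in for a trained fully connected layer.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=(256, 128)).astype(np.float32)

codebook, labels, bits = compress_layer(weights, k=16)
restored = codebook[labels]  # decode: look each index up in the codebook
ratio = compression_ratio(weights.size, 16, bits)
print(f"{bits} bits per weight, ratio {ratio:.2f}x")
```

With 16 clusters each index needs four bits, so the per-layer ratio under these assumptions stays just below 32/4 = 8x; with 8 clusters (three bits) the bound is 32/3, about 10.7x. The figure reported in the abstract is for the whole AlexNet model and is not reproduced by this toy layer.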
Guilin Chen, Sheng Ma, Yang Guo
College of Computer, National University of Defense Technology, China
Cite: Guilin Chen, Sheng Ma, Yang Guo, "Towards Convolutional Neural Network Compression via K-Means Cluster," Proceedings of 2018 the 8th International Workshop on Computer Science and Engineering, pp. 282-287, Bangkok, 28-30 June, 2018.