Pruning: [1] Han S, Mao H, Dally W J. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding[C]//International Conference on Learning Representations (ICLR). 2016.
Quantization: [1] Jacob B, Kligys S, Chen B, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 2704-2713.