https://arxiv.org/abs/1711.02213 Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks

https://las.inf.ethz.ch/files/djolonga17learning.pdf Differentiable Learning of Submodular Models

In this paper we focus on the problem of submodular minimization, for which we show that such layers are indeed possible. The key idea is that we can continuously relax the output without sacrificing guarantees. We provide an easily computable approximation to the Jacobian complemented with a complete theoretical analysis. Finally, these contributions let us experimentally learn probabilistic log-supermodular models via a bi-level variational inference formulation.
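
As a rough illustration of the continuous relaxation mentioned above, the sketch below evaluates the Lovász extension of a set function, the standard convex relaxation used for submodular minimization; the particular set function, the NumPy implementation and all names are illustrative assumptions, not the paper's construction.

<code python>
import numpy as np

def lovasz_extension(F, x):
    """Value and a subgradient of the Lovász extension of a set
    function F (with F(empty set) == 0) at a point x.
    The extension is convex exactly when F is submodular, which is
    what makes a continuous relaxation of submodular minimization
    possible."""
    order = np.argsort(-x)                 # sort coordinates descending
    grad = np.zeros_like(x, dtype=float)
    selected, prev = [], 0.0
    for i in order:
        selected.append(int(i))
        cur = F(selected)
        grad[i] = cur - prev               # marginal gain of adding element i
        prev = cur
    return float(x @ grad), grad           # f(x) = sum_i x_i * grad_i

# Illustrative submodular function: sqrt of a nonnegative modular function.
weights = np.array([1.0, 2.0, 0.5, 1.5])
F = lambda S: float(np.sqrt(weights[S].sum())) if len(S) else 0.0

value, grad = lovasz_extension(F, np.array([0.9, 0.1, 0.7, 0.3]))
print(value, grad)
</code>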

https://openreview.net/pdf?id=HJGXzmspb Training and Inference with Integers in Deep Neural Networks https://github.com/boluoweifenda/WAGE

https://openreview.net/forum?id=S19dR9x0b Alternating Multi-bit Quantization for Recurrent Neural Networks

In this work, we address these problems by quantizing the network, both weights and activations, into multiple binary codes {-1, +1}. We formulate the quantization as an optimization problem. Under the key observation that, once the quantization coefficients are fixed, the binary codes can be derived efficiently by a binary search tree, alternating minimization is then applied.
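
A minimal sketch of that alternating scheme for a single weight vector, assuming a k-bit code per weight: with the binary codes fixed, the coefficients come from an ordinary least-squares fit; with the coefficients fixed, each weight is reassigned its nearest reconstructible value (brute force over the 2^k candidates stands in for the binary-search-tree step mentioned above). The function name, greedy initialization and defaults are illustrative, not the authors' implementation.

<code python>
import numpy as np
from itertools import product

def alternating_multibit_quantize(w, k=2, iters=10):
    """Approximate w with sum_i alpha[i] * B[:, i], where each column
    of B is a binary code in {-1, +1}^n (illustrative sketch)."""
    n = w.size
    # Greedy initialization: peel off one binary code at a time.
    B = np.zeros((n, k))
    alpha = np.zeros(k)
    residual = w.astype(float).copy()
    for i in range(k):
        B[:, i] = np.where(residual >= 0, 1.0, -1.0)
        alpha[i] = np.abs(residual).mean()
        residual -= alpha[i] * B[:, i]
    # All 2^k candidate code rows a single weight can take.
    codes = np.array(list(product([-1.0, 1.0], repeat=k)))   # shape (2^k, k)
    for _ in range(iters):
        # Codes fixed -> coefficients are an ordinary least-squares fit.
        alpha, *_ = np.linalg.lstsq(B, w, rcond=None)
        # Coefficients fixed -> each weight picks the candidate code whose
        # reconstruction (codes @ alpha) is closest; brute force over the
        # 2^k candidates is equivalent to the binary search for small k.
        recon = codes @ alpha                                 # shape (2^k,)
        B = codes[np.argmin(np.abs(w[:, None] - recon[None, :]), axis=1)]
    return alpha, B

w = np.random.randn(1024)
alpha, B = alternating_multibit_quantize(w, k=2)
print("relative error:", np.linalg.norm(B @ alpha - w) / np.linalg.norm(w))
</code>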

https://arxiv.org/pdf/1712.05877.pdf Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

https://openreview.net/forum?id=B1IDRdeCW The High-Dimensional Geometry of Binary Neural Networks

Neural networks with binary weights and activations have similar performance to their continuous counterparts with substantially reduced execution time and power usage. We provide an experimentally verified theory for understanding how one can get away with such a massive reduction in precision, based on the geometry of high-dimensional vectors. First, we show that binarization of high-dimensional vectors preserves their direction, in the sense that the angle between a random vector and its binarized version is much smaller than the angle between two random vectors (Angle Preservation Property). Second, we take the perspective of the network and show that binarization approximately preserves weight-activation dot products (Dot Product Proportionality Property). More generally, when using a network compression technique, we recommend looking at the weight-activation dot product histograms as a heuristic to help localize the layers that are most responsible for performance degradation. Third, we discuss the impacts of the low effective dimensionality of the data on the first layer of the network. We recommend either using continuous weights for the first layer or a Generalized Binarization Transformation. Such a transformation may be useful for architectures like LSTMs, where the update for the hidden state declares a particular set of axes to be important (e.g. by taking the pointwise multiply of the forget gates with the cell state). Finally, we show that neural networks with ternary weights and activations can also be understood with our approach. More broadly speaking, our theory is useful for analyzing a variety of neural network compression techniques that transform the weights, activations or both to reduce the execution cost without degrading performance.
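
The Angle Preservation Property above is easy to check numerically: for an i.i.d. Gaussian vector, the expected cosine with its sign-binarized version is sqrt(2/pi) ≈ 0.8, an angle of roughly 37 degrees, whereas two independent random vectors are nearly orthogonal in high dimension. A small self-contained check (the Gaussian assumption, dimension and sample count are arbitrary choices, not the paper's exact setup):

<code python>
import numpy as np

def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

rng = np.random.default_rng(0)
d, trials = 4096, 200          # illustrative dimension and sample count

# Angle between a random vector and its {-1, +1} binarization ...
xs = rng.standard_normal((trials, d))
to_bin = lambda x: np.where(x >= 0, 1.0, -1.0)
binarized = np.mean([angle_deg(x, to_bin(x)) for x in xs])

# ... versus the angle between two independent random vectors.
ys = rng.standard_normal((trials, d))
independent = np.mean([angle_deg(x, y) for x, y in zip(xs, ys)])

print(f"vector vs. its binarization: {binarized:5.1f} deg")   # ~37 deg
print(f"two independent vectors:     {independent:5.1f} deg")  # ~90 deg
</code>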

https://arxiv.org/abs/1706.02021 Network Sketching: Exploiting Binary Structure in Deep CNNs

https://arxiv.org/abs/1806.08342v1 Quantizing deep convolutional networks for efficient inference: A whitepaper

https://arxiv.org/abs/1808.08784v1 Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant https://github.com/cc-hpc-itwm/TensorQuant

https://arxiv.org/abs/1810.04714 Training Generative Adversarial Networks with Binary Neurons by End-to-end Backpropagation https://github.com/salu133445/binarygan