This is an old revision of the document!

# Random Matrix Theory

“Entropy is the price of structure.” - Ilya Prigogine

Aliases

This identifies the pattern and should be representative of the concept that it describes. The name should be a noun that should be easily usable within a sentence. We would like the pattern to be easily referenceable in conversation between practitioners.

Intent

The hidden structure of randomness.

Motivation

This section describes the reason why this pattern is needed in practice. Other pattern languages indicate this as the Problem. In our pattern language, we express this in a question or several questions and then we provide further explanation behind the question.

Sketch

This section provides alternative descriptions of the pattern in the form of an illustration or alternative formal expression. By looking at the sketch a reader may quickly understand the essence of the pattern.

Discussion

This is the main section of the pattern that goes in greater detail to explain the pattern. We leverage a vocabulary that we describe in the theory section of this book. We don’t go into intense detail into providing proofs but rather reference the sources of the proofs. How the motivation is addressed is expounded upon in this section. We also include additional questions that may be interesting topics for future research.

Known Uses

Here we review several projects or papers that have used this pattern.

Related Patterns In this section we describe in a diagram how this pattern is conceptually related to other patterns. The relationships may be as precise or may be fuzzy, so we provide further explanation into the nature of the relationship. We also describe other patterns may not be conceptually related but work well in combination with this pattern.

Relationship to Canonical Patterns

Relationship to other Patterns

We provide here some additional external material that will help in exploring this pattern in more detail.

References

http://arxiv.org/pdf/1607.06011v1.pdf On the Modeling of Error Functions as High Dimensional Landscapes for Weight Initialization in Learning Networks

http://arxiv.org/abs/1603.04733v5 Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors

We introduce a scalable variational Bayesian neural network where the parameters are governed by a probability distribution over random matrices: the matrix variate Gaussian. By utilizing properties of this distribution we can see that our model can be considered as a composition of Gaussian Processes with nonlinear kernels of a specific form. This kernel is formed from the kroneker product of two separate kernels; a global output kernel and an input specific kernel, where the latter is composed from fixed dimension nonlinear basis functions (the inputs to each layer) weighted by their covariance. We continue in exploiting this duality and introduce pseudo input-output pairs for each layer in the network, which in turn better maintain the Gaussian Process properties of our model thus increasing the flexibility of the posterior distribution.

http://bookstore.ams.org/gsm-132 Topics in Random Matrix Theory

https://arxiv.org/pdf/0909.3912.pdf A Random Matrix Approach to Language Acquisition

Our model of linguistic interaction is analytically studied using methods of statistical physics and simulated by Monte Carlo techniques. The analysis reveals an intricate relationship between the innate propensity for language acquisition β and the lexicon size N, N ∼ exp(β).

http://openreview.net/pdf?id=B186cP9gx SINGULARITY OF THE HESSIAN IN DEEP LEARNING

We look at the eigenvalues of the Hessian of a loss function before and after training. The eigenvalue distribution is seen to be composed of two parts, the bulk which is concentrated around zero, and the edges which are scattered away from zero. We present empirical evidence for the bulk indicating how overparametrized the system is, and for the edges indicating the complexity of the input data.

https://arxiv.org/pdf/1711.04735.pdf Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice