The emergence of spectral universality in deep networks

Recent work has shown that tight concentration of the entire spectrum of singular values of a deep network’s input-output Jacobian around one at initialization can speed up learning by orders of ma...