Sparse Matrix Factorization
Behnam Neyshabur¹ and Rina Panigrahy²

arXiv:1311.3315v3 [cs.LG] 13 May 2014

¹ Toyota Technological Institute at Chicago, bneyshabur@ttic.edu
² Microsoft Research, rina@microsoft.com

Abstract. We investigate the problem of factoring a matrix into several sparse matrices and propose an algorithm for this under randomness and sparsity assumptions. This problem can be viewed as a simplification of the deep learning problem, where finding a factorization corresponds to finding edges in the different layers as well as the values of the hidden units. We prove that, under certain assumptions on a sparse linear deep network with $n$ nodes in each layer, our algorithm is able to recover the structure of the network and the values of the top-layer hidden units for depths up to $\tilde{O}(n^{1/6})$. We further discuss the relation among sparse matrix factorization, deep learning, sparse recovery and dictionary learning.
Keywords: Sparse Matrix Factorization, Dictionary Learning, Sparse Encoding, Deep Learning

1 Introduction

In this paper we study the following matrix factorization problem. The sparsity $\pi(X)$ of a matrix $X$ is the number of non-zero entries in $X$.
Problem 1 (Sparse Matrix-Factorization). Given an input matrix $Y$, factorize it as $Y = X_1 X_2 \cdots X_s$ so as to minimize the total sparsity $\sum_{i=1}^{s} \pi(X_i)$.
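To make the objective concrete, here is a minimal NumPy sketch (not from the paper; the helper names `sparsity`, `total_sparsity`, and `reconstructs` are illustrative) that evaluates the total sparsity of a candidate factorization and checks the constraint $Y = X_1 X_2 \cdots X_s$:

```python
import numpy as np

def sparsity(X):
    """pi(X): the number of non-zero entries of X."""
    return np.count_nonzero(X)

def total_sparsity(factors):
    """Objective of Problem 1: sum of pi(X_i) over a candidate factorization."""
    return sum(sparsity(X) for X in factors)

def reconstructs(Y, factors):
    """Constraint of Problem 1: does X_1 X_2 ... X_s equal Y?"""
    product = factors[0]
    for X in factors[1:]:
        product = product @ X
    return np.allclose(product, Y)

# Toy example: a dense Y that admits a strictly sparser two-factor form.
X1 = np.array([[1., 0.],
               [1., 0.],
               [0., 1.],
               [0., 1.]])            # pi(X1) = 4
X2 = np.array([[1., 2., 3., 4.],
               [5., 6., 7., 8.]])    # pi(X2) = 8
Y = X1 @ X2                          # dense: pi(Y) = 16

assert reconstructs(Y, [X1, X2])
print(total_sparsity([X1, X2]), "<", sparsity(Y))  # 12 < 16
```

The toy example shows why factoring can pay off: the dense product $Y$ has 16 non-zero entries, while its two factors have only 12 between them.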
The above problem is a simplification of the following non-linear version, which is directly related to learning with deep networks.
Problem 2 (Non-linear Sparse Matrix-Factorization). Given a matrix $Y$, minimize $\sum_{i=1}^{s} \pi(X_i)$ such that $\sigma(X_1 \sigma(X_2 \sigma(\cdots X_s))) = Y$, where $\sigma(x)$ is the sign function ($+1$ if $x > 0$, $-1$ if $x < 0$, and $0$ otherwise) and $\sigma$ applied to a matrix is simply the sign function applied to each entry. Here the entries of $Y$ are in $\{0, \pm 1\}$.
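For concreteness, the following sketch (again assuming NumPy; `forward` is a hypothetical helper, not from the paper) evaluates the non-linear map $\sigma(X_1 \sigma(X_2 \sigma(\cdots X_s)))$ from the innermost factor outward, so its output always has entries in $\{0, \pm 1\}$:

```python
import numpy as np

def forward(factors):
    """Evaluate sigma(X_1 sigma(X_2 sigma(... X_s))) from the inside out.
    np.sign matches sigma exactly: +1 if x > 0, -1 if x < 0, 0 otherwise."""
    out = np.sign(factors[-1])          # innermost application: sigma(X_s)
    for X in reversed(factors[:-1]):
        out = np.sign(X @ out)          # sigma(X_i times previous result)
    return out

# Toy check: the output entries are always in {0, +1, -1}, as required for Y.
X1 = np.array([[1., -1.],
               [0.,  2.]])
X2 = np.array([[3., 0., -1.],
               [0., 0.,  2.]])
Y = forward([X1, X2])
print(Y)  # [[ 1.  0. -1.]
          #  [ 0.  0.  1.]]
```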
Connection to Deep Learning and Compression: The above problem is related to learning using deep networks (see [3]), which are generalizations of neural networks. They are layered networks of nodes connected by edges between successive layers.



References

3. Y. Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2009.
13. R. Salakhutdinov and G. E. Hinton. Deep Boltzmann machines. Journal of Machine Learning Research, 5:448–455, 2009.
15. L. Wan, M. Zeiler, S. Zhang, Y. LeCun, and R. Fergus. Regularization of neural networks using DropConnect. ICML, 2013.
16. P. M. Wood. Universality and the circular law for sparse random matrices. The Annals of Applied Probability, 22(3):1266–1300, 2012.
