Adapting to Unknown Noise Distribution in Matrix Denoising
We consider the problem of estimating an unknown matrix ∈^m× n, from observations = + where is a noise matrix with independent and identically distributed entries, as to minimize estimation error measured in operator norm. Assuming that the underlying signal is low-rank and incoherent with respect to the canonical basis, we prove that minimax risk is equivalent to (√(m)∨√(n))/√(_W) in the high-dimensional limit m,n→∞, where _W is the Fisher information of the noise. Crucially, we develop an efficient procedure that achieves this risk, adaptively over the noise distribution (under certain regularity assumptions). Letting = ^ --where ∈^m× r, ∈^n× r are orthogonal, and r is kept fixed as m,n→∞-- we use our method to estimate , . Standard spectral methods provide non-trivial estimates of the factors , (weak recovery) only if the singular values of are larger than (mn)^1/4(W_11)^1/2. We prove that the new approach achieves weak recovery down to the the information-theoretically optimal threshold (mn)^1/4_W^1/2.
READ FULL TEXT