On the computational and statistical complexity of over-parameterized matrix sensing
We consider solving the low-rank matrix sensing problem with the Factorized Gradient Descent (FGD) method when the true rank is unknown and over-specified, which we refer to as over-parameterized matrix sensing. If the ground truth signal $\mathbf{X}^* \in \mathbb{R}^{d \times d}$ is of rank $r$, but we try to recover it using $\mathbf{F}\mathbf{F}^\top$ where $\mathbf{F} \in \mathbb{R}^{d \times k}$ and $k > r$, the existing statistical analysis falls short, due to a flat local curvature of the loss function around the global minima. By decomposing the factorized matrix $\mathbf{F}$ into separate column spaces to capture the effect of the extra ranks, we show that $\|\mathbf{F}_t \mathbf{F}_t^\top - \mathbf{X}^*\|_F^2$ converges to a statistical error of $\tilde{\mathcal{O}}(k d \sigma^2 / n)$ after $\tilde{\mathcal{O}}\big(\frac{\sigma_r}{\sigma}\sqrt{\frac{n}{d}}\big)$ iterations, where $\mathbf{F}_t$ is the output of FGD after $t$ iterations, $\sigma^2$ is the variance of the observation noise, $\sigma_r$ is the $r$-th largest eigenvalue of $\mathbf{X}^*$, and $n$ is the number of samples. Our results therefore offer a comprehensive picture of the statistical and computational complexity of FGD for the over-parameterized matrix sensing problem.
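To make the setup concrete, the following is a minimal sketch of factorized gradient descent for matrix sensing, assuming Gaussian sensing matrices $A_i$ and observations $y_i = \langle A_i, \mathbf{X}^* \rangle + \text{noise}$; the function name, step size, initialization scale, and iteration count are illustrative choices, not the exact algorithm or tuning analyzed in the paper.

```python
import numpy as np

def factorized_gradient_descent(A, y, k, step_size=0.1, n_iters=500, seed=0):
    """Recover a low-rank X* from y_i = <A_i, X*> + noise by minimizing
    (1/2n) * sum_i (<A_i, F F^T> - y_i)^2 over F in R^{d x k} via gradient descent.
    Step size, initialization, and iteration count are illustrative, not the paper's tuning."""
    n, d, _ = A.shape
    rng = np.random.default_rng(seed)
    F = 1e-3 * rng.standard_normal((d, k))             # small random initialization (illustrative)
    for _ in range(n_iters):
        X = F @ F.T
        residuals = np.einsum('nij,ij->n', A, X) - y   # <A_i, F F^T> - y_i for each sample
        grad_X = np.einsum('n,nij->ij', residuals, A) / n
        F = F - step_size * (grad_X + grad_X.T) @ F    # chain rule through X = F F^T
    return F

# Over-parameterized example: true rank r = 2, factor rank k = 4 > r.
d, r, k, n, sigma = 20, 2, 4, 2000, 0.01
rng = np.random.default_rng(1)
U = rng.standard_normal((d, r)) / np.sqrt(d)
X_star = U @ U.T                                       # rank-r ground truth
A = rng.standard_normal((n, d, d))                     # Gaussian sensing matrices
y = np.einsum('nij,ij->n', A, X_star) + sigma * rng.standard_normal(n)
F_hat = factorized_gradient_descent(A, y, k)
print("squared Frobenius error:", np.linalg.norm(F_hat @ F_hat.T - X_star, 'fro') ** 2)
```

The example deliberately sets the factor rank $k$ larger than the true rank $r$, which is the over-parameterized regime the abstract refers to.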