Non-Parametric Estimation of Manifolds from Noisy Data
A common observation in data-driven applications is that high dimensional data has a low intrinsic dimension, at least locally. In this work, we consider the problem of estimating a d dimensional sub-manifold of ℝ^D from a finite set of noisy samples. Assuming that the data was sampled uniformly from a tubular neighborhood of ℳ∈𝒞^k, a compact manifold without boundary, we present an algorithm that takes a point r from the tubular neighborhood and outputs p̂_n∈ℝ^D, and T_p̂_nℳ an element in the Grassmanian Gr(d, D). We prove that as the number of samples n→∞ the point p̂_n converges to p∈ℳ and T_p̂_nℳ converges to T_pℳ (the tangent space at that point) with high probability. Furthermore, we show that the estimation yields asymptotic rates of convergence of n^-k/2k + d for the point estimation and n^-k-1/2k + d for the estimation of the tangent space. These rates are known to be optimal for the case of function estimation.
READ FULL TEXT