Convergence of Graph Laplacian with kNN Self-tuned Kernels
A kernelized Gram matrix W constructed from data points {x_i}_{i=1}^N as W_ij = k_0(‖x_i - x_j‖^2 / σ^2) is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth σ, and a common practice, called the self-tuned kernel, adaptively sets a σ_i at each point x_i by the k-nearest neighbor (kNN) distance. When the x_i's are sampled from a d-dimensional manifold embedded in a possibly high-dimensional space, theoretical results on graph Laplacian convergence with self-tuned kernels, unlike with fixed-bandwidth kernels, have been incomplete. This paper proves the convergence of the graph Laplacian operator L_N to the manifold (weighted-)Laplacian for a new family of kNN self-tuned kernels W^(α)_ij = k_0(‖x_i - x_j‖^2 / (ϵ ρ̂(x_i) ρ̂(x_j))) / (ρ̂(x_i)^α ρ̂(x_j)^α), where ρ̂ is the bandwidth function estimated by kNN, and the limiting operator is also parametrized by α. When α = 1, the limiting operator is the weighted manifold Laplacian Δ_p. Specifically, we prove the pointwise convergence of L_N f and the convergence of the graph Dirichlet form, both with rates. Our analysis is based on first establishing a C^0 consistency for ρ̂, which bounds the relative estimation error |ρ̂ - ρ̄|/ρ̄ uniformly with high probability, where ρ̄ = p^(-1/d) and p is the data density function. Our theoretical results reveal the advantage of self-tuned kernels over fixed-bandwidth kernels via a smaller variance error in low-density regions. The algorithm requires no prior knowledge of d or the data density. The theoretical results are supported by numerical experiments.
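For concreteness, the kernel construction described above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: it assumes the Gaussian kernel k_0(r) = exp(-r), a random-walk normalization for L_N (conventions for the sign and scaling of the Laplacian vary), and the function and parameter names (knn_self_tuned_laplacian, k, eps, alpha) are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_self_tuned_laplacian(X, k=10, eps=0.05, alpha=1.0):
    """Sketch of the kNN self-tuned kernel W^(alpha) and a graph Laplacian built from it.

    X     : (N, D) array of data points x_i.
    k     : number of nearest neighbors defining the bandwidth rho_hat.
    eps   : global scale parameter epsilon.
    alpha : exponent of the density normalization.
    """
    # Pairwise squared Euclidean distances.
    dist2 = cdist(X, X, metric="sqeuclidean")

    # rho_hat(x_i): distance to the k-th nearest neighbor of x_i
    # (column 0 of the sorted distances is x_i itself).
    rho = np.sqrt(np.sort(dist2, axis=1)[:, k])

    # Self-tuned kernel
    #   W^(alpha)_ij = k_0(||x_i - x_j||^2 / (eps * rho_i * rho_j)) / (rho_i^alpha * rho_j^alpha),
    # here with the Gaussian choice k_0(r) = exp(-r).
    scale = eps * np.outer(rho, rho)
    W = np.exp(-dist2 / scale) / np.outer(rho**alpha, rho**alpha)

    # One common convention for the graph Laplacian: L_N = I - D^{-1} W.
    deg = W.sum(axis=1)
    L = np.eye(len(X)) - W / deg[:, None]
    return L, rho
```

Note that, as stated in the abstract, the construction uses only the kNN distances: neither the intrinsic dimension d nor the density p appears explicitly.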