Consistency of Spectral Clustering on Hierarchical Stochastic Block Models
We propose a generic network model, based on the Stochastic Block Model, to study the hierarchy of communities in real-world networks, under which the connection probabilities are structured in a binary tree. Under this model, we show that the eigenstructure of the expected unnormalized graph Laplacian reveals both the community structure of the network and the hierarchy of communities in a recursive fashion. Motivated by this property of the population eigenstructure, we develop a recursive bi-partitioning algorithm that divides the network into two communities based on the Fiedler vector of the unnormalized graph Laplacian and repeats the split until a stopping rule indicates no further community structure. We prove the weak and strong consistency of our algorithm for sparse networks with expected node degree of order O(log n), based on newly developed theory on ℓ_{2→∞} eigenspace perturbation, without knowing the total number of communities in advance. Unlike most existing work, our theory covers multi-scale networks where the connection probabilities may differ by orders of magnitude, which comprise an important class of models that are practically relevant but technically challenging to deal with. Finally, we demonstrate the performance of our algorithm on synthetic data and real-world examples.
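
As a rough illustration of the recursive bi-partitioning idea described above, the following Python sketch splits a graph by the sign of the Fiedler vector of the unnormalized Laplacian L = D - A and recurses on each side. It is not the authors' implementation. The abstract does not specify the stopping rule, so the `max_depth` and `min_size` parameters here are hypothetical stand-ins for it, and `expected_adjacency` is an assumed toy helper that builds the population (expected) adjacency matrix of a four-community binary-tree model with illustrative probabilities p > q1 > q2.

```python
import numpy as np


def fiedler_split(A):
    """Return a boolean mask splitting nodes by the sign of the Fiedler vector."""
    L = np.diag(A.sum(axis=1)) - A      # unnormalized Laplacian L = D - A
    _, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    return vecs[:, 1] >= 0              # eigenvector of the 2nd-smallest eigenvalue


def recursive_bipartition(A, nodes=None, depth=0, max_depth=2, min_size=2):
    """Recursively bisect the graph; returns a list of node-index arrays.

    max_depth and min_size are placeholder stopping criteria (assumptions),
    not the stopping rule from the paper.
    """
    if nodes is None:
        nodes = np.arange(A.shape[0])
    if depth >= max_depth or len(nodes) < 2 * min_size:
        return [nodes]
    mask = fiedler_split(A[np.ix_(nodes, nodes)])
    left, right = nodes[mask], nodes[~mask]
    if len(left) < min_size or len(right) < min_size:
        return [nodes]                  # degenerate split: treat as one community
    return (recursive_bipartition(A, left, depth + 1, max_depth, min_size)
            + recursive_bipartition(A, right, depth + 1, max_depth, min_size))


def expected_adjacency(m=5, p=0.8, q1=0.4, q2=0.1):
    """Toy expected adjacency for 4 communities of size m on a binary tree.

    p: within-community, q1: between sibling communities,
    q2: across the top-level split (all values are illustrative).
    """
    labels = np.repeat(np.arange(4), m)
    same = labels[:, None] == labels[None, :]
    sibling = (labels[:, None] // 2) == (labels[None, :] // 2)
    P = np.where(same, p, np.where(sibling, q1, q2))
    np.fill_diagonal(P, 0.0)            # no self-loops
    return P, labels


P, labels = expected_adjacency()
communities = recursive_bipartition(P, max_depth=2)
```

On this noiseless toy input, the first split separates the two top-level branches of the tree and the second level separates sibling communities. On a sampled sparse network the splits would be noisy, and a data-driven stopping rule would replace the fixed depth used here.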