Spectral Statistics of Sample Block Correlation Matrices
A fundamental concept in multivariate statistics, sample correlation matrix, is often used to infer the correlation/dependence structure among random variables, when the population mean and covariance are unknown. A natural block extension of it, sample block correlation matrix, is proposed to take on the same role, when random variables are generalized to random sub-vectors. In this paper, we establish a spectral theory of the sample block correlation matrices and apply it to group independent test and related problem, under the high-dimensional setting. More specifically, we consider a random vector of dimension p, consisting of k sub-vectors of dimension p_t's, where p_t's can vary from 1 to order p. Our primary goal is to investigate the dependence of the k sub-vectors. We construct a random matrix model called sample block correlation matrix based on n samples for this purpose. The spectral statistics of the sample block correlation matrix include the classical Wilks' statistic and Schott's statistic as special cases. It turns out that the spectral statistics do not depend on the unknown population mean and covariance. Further, under the null hypothesis that the sub-vectors are independent, the limiting behavior of the spectral statistics can be described with the aid of the Free Probability Theory. Specifically, under three different settings of possibly n-dependent k and p_t's, we show that the empirical spectral distribution of the sample block correlation matrix converges to the free Poisson binomial distribution, free Poisson distribution (Marchenko-Pastur law) and free Gaussian distribution (semicircle law), respectively. We then further derive the CLTs for the linear spectral statistics of the block correlation matrix under general setting.
READ FULL TEXT