Change point detection in high dimensional data with U-statistics

07/18/2022
by B. Cooper Boniece, et al.

We consider the problem of detecting distributional changes in a sequence of high-dimensional data. Our proposed methods are nonparametric, suitable for either continuous or discrete data, and are based on weighted cumulative sums of U-statistics stemming from L_p norms. We establish the asymptotic distribution of our proposed test statistics separately in the cases of weakly dependent and strongly dependent coordinates as min{N,d}→∞, where N denotes the sample size and d the dimension, and we also provide sufficient conditions for consistency of the proposed test procedures under a general fixed alternative with one change point. We further assess the finite-sample performance of the test procedures through Monte Carlo studies, and conclude with two applications to Twitter data concerning mentions of U.S. governors and the frequency of tweets containing social justice keywords.
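
To make the idea of a weighted scan over U-statistics built from L_p distances more concrete, here is a minimal Python sketch. It is an illustration only and is not the test statistic defined in the paper: the energy-distance-style contrast, the weight (t(1-t))^gamma, the dimension-normalized L_p distance, and the names pairwise_lp and cusum_scan are all assumptions introduced here. No critical values are computed, so this only locates a candidate change point and does not perform a test.

```python
import numpy as np


def pairwise_lp(X, p=2.0):
    """Dimension-normalized L_p distance between every pair of rows of X (N x d)."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (diff ** p).mean(axis=2) ** (1.0 / p)


def cusum_scan(X, p=2.0, gamma=1.0):
    """Scan candidate split points k = 2, ..., N-2 (hypothetical illustration).

    For each k, compare the average cross-segment distance with the average
    within-segment distances (two-sample U-statistic contrast), then
    down-weight splits near the boundaries by (t * (1 - t)) ** gamma, t = k / N.
    Returns the maximizing k and the full vector of weighted contrasts.
    """
    N = X.shape[0]
    D = pairwise_lp(X, p)
    ks = np.arange(2, N - 1)
    stats = np.empty(len(ks))
    for idx, k in enumerate(ks):
        # D is symmetric with a zero diagonal, so the block sum divided by
        # k * (k - 1) equals the mean over distinct pairs.
        within_left = D[:k, :k].sum() / (k * (k - 1))
        within_right = D[k:, k:].sum() / ((N - k) * (N - k - 1))
        cross = D[:k, k:].mean()
        t = k / N
        stats[idx] = (t * (1 - t)) ** gamma * (2 * cross - within_left - within_right)
    return int(ks[np.argmax(stats)]), stats


# Synthetic example: N = 60 observations in d = 50 dimensions,
# with a mean shift in every coordinate after observation 30.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 50))
X[30:] += 1.0
k_hat, stats = cusum_scan(X)
print("estimated change point:", k_hat)
```

The scan recomputes block sums for every k, which is simple but costs O(N^3) overall; cumulative sums over the distance matrix would reduce this if N is large. Calibrating a rejection threshold would require the limiting distributions established in the full text, which this sketch does not attempt to reproduce.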
