Covariate-dependent control limits for the detection of abnormal price changes in scanner data
Large-scale sales data for consumer goods, known as scanner data, are obtained by scanning the bar codes of individual products at the points of sale in retail outlets. Many national statistical offices (NSOs) are attempting to use scanner data to build consumer price statistics. As in other statistical procedures, detecting abnormal transactions in sales prices is an important step in the analysis. Two of the most popular methods for outlier detection are the quartile method and the Tukey algorithm. Both rely solely on information about price changes and ignore other covariates (e.g., sales volume or type of retail shop) that are also available in the scanner data. In this paper, we propose a new method for detecting abnormal price changes that takes such covariates into account, particularly sales volume. We assume that the variance of the log price change is a smooth function of the sales volume and estimate it from previously observed data. We show numerically the advantage of the new method over the existing methods. We also apply the methods to real scanner data collected at weekly intervals by the Korean Chamber of Commerce and Industry during 2013 and 2014.
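The idea of a control limit that depends on sales volume can be illustrated with a minimal sketch. It is not the estimator from the paper: it assumes a Gaussian-kernel (Nadaraya-Watson) smoother on the log-volume scale, a mean log price change of approximately zero, and a symmetric limit of k estimated standard deviations. The function names `smooth_variance` and `flag_abnormal`, the bandwidth, the multiplier `k`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def smooth_variance(volume_hist, logchange_hist, volume_new, bandwidth=0.5):
    """Kernel-smoothed estimate of Var(log price change | sales volume).

    Assumption: the mean log price change is close to zero, so the squared
    log change is used as a noisy observation of the variance.
    """
    x_hist = np.log(np.asarray(volume_hist, dtype=float))
    x_new = np.log(np.atleast_1d(np.asarray(volume_new, dtype=float)))
    sq = np.asarray(logchange_hist, dtype=float) ** 2
    # Gaussian kernel weights: rows are new points, columns are historical points.
    u = (x_new[:, None] - x_hist[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)
    return (w @ sq) / w.sum(axis=1)

def flag_abnormal(volume_hist, logchange_hist, volume_new, logchange_new,
                  k=3.0, bandwidth=0.5):
    """Flag new log price changes that exceed k estimated standard deviations."""
    sigma = np.sqrt(smooth_variance(volume_hist, logchange_hist,
                                    volume_new, bandwidth))
    limits = k * sigma
    flags = np.abs(np.atleast_1d(np.asarray(logchange_new, dtype=float))) > limits
    return flags, limits

# Simulated history (hypothetical): price changes are noisier for low-volume items.
rng = np.random.default_rng(0)
volume_hist = np.exp(rng.uniform(0.0, 6.0, size=2000))
logchange_hist = rng.normal(0.0, 0.2 / np.sqrt(volume_hist))

# The same 10% log price change observed for a low-volume and a high-volume item.
volume_new = np.array([2.0, 300.0])
logchange_new = np.array([0.10, 0.10])
flags, limits = flag_abnormal(volume_hist, logchange_hist, volume_new, logchange_new)
print(flags, limits)
```

In this simulated setting the limit is wider for the low-volume item than for the high-volume one, so an identical price change can be treated as ordinary in one case and abnormal in the other. A limit based on price changes alone would apply the same threshold to both.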