Consistent estimation of high-dimensional factor models when the factor number is over-estimated
A high-dimensional r-factor model for an n-dimensional vector time series is characterised by the presence of a large eigengap (increasing with n) between the r-th and the (r + 1)-th largest eigenvalues of the covariance matrix. Consequently, Principal Component Analysis (PCA) is a popular estimation method for factor models, and its consistency, when r is correctly estimated, is well established in the literature. However, factor number estimators often suffer from the lack of an obvious eigengap in finite samples. We show empirically that they tend to over-estimate the factor number in the presence of moderate correlations in the idiosyncratic (not factor-driven) components, and further prove that over-estimation of r can result in non-negligible errors in the PCA estimators. To remedy this problem, we propose two new estimators based on capping or scaling the entries of the sample eigenvectors. Without requiring knowledge of the true factor number, these estimators are less sensitive than the PCA estimator to over-estimation of r. We show both theoretically and empirically that the two estimators successfully control the over-estimation error, and demonstrate their good performance on macroeconomic and financial time series datasets.
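To make the eigengap and the effect of over-estimating r concrete, the following is a minimal simulation sketch of the plain PCA estimator only; it does not implement the capped or scaled estimators proposed in the paper. The dimensions, the Gaussian loadings and factors, the uncorrelated idiosyncratic noise, and the function names `simulate_factor_model` and `pca_common_component` are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def simulate_factor_model(n=100, T=400, r=3, seed=0):
    """Simulate X = loadings @ factors + noise (all settings are assumptions)."""
    rng = np.random.default_rng(seed)
    loadings = rng.normal(size=(n, r))   # n x r factor loadings
    factors = rng.normal(size=(r, T))    # r x T latent factors
    common = loadings @ factors          # factor-driven (common) component
    idio = rng.normal(size=(n, T))       # idiosyncratic component, i.i.d. here
    return common + idio, common


def pca_common_component(X, k):
    """Plain PCA estimate of the common component using k leading eigenvectors."""
    T = X.shape[1]
    cov = X @ X.T / T                        # n x n sample covariance (zero-mean data)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :k]            # k leading eigenvectors
    return top @ (top.T @ X), eigvals[::-1]  # projection, eigenvalues descending


X, common = simulate_factor_model()
est_true, eigvals = pca_common_component(X, k=3)  # correct factor number
est_over, _ = pca_common_component(X, k=6)        # over-estimated factor number

# Ratio of the r-th to the (r + 1)-th eigenvalue: large when the eigengap is clear.
gap_ratio = eigvals[2] / eigvals[3]

# Mean squared error of the estimated common component in each case.
err_true = np.mean((est_true - common) ** 2)
err_over = np.mean((est_over - common) ** 2)

print(f"eigengap ratio: {gap_ratio:.2f}")
print(f"MSE with k = r: {err_true:.4f}, MSE with k > r: {err_over:.4f}")
```

In this toy setting the extra eigenvectors mostly project idiosyncratic noise into the estimate, which is the kind of over-estimation error the abstract refers to. With correlated idiosyncratic components, the eigengap would be less pronounced than in this i.i.d. sketch.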