Functional Linear Regression of CDFs
The estimation of cumulative distribution functions (CDFs) is an important learning task with a wide variety of downstream applications, e.g., risk assessment in prediction and decision making. We study functional regression of contextual CDFs, where each data point is sampled from a linear combination of context-dependent CDF bases. We propose estimation methods that estimate CDFs accurately everywhere. In particular, given n samples with d bases, we show estimation error upper bounds of O(√(d/n)) for the fixed design, random design, and adversarial context cases. We also derive matching information-theoretic lower bounds, establishing minimax optimality for CDF functional regression. To complete our study, we consider agnostic settings where the data generation process does not match the assumed model. We characterize the error of the proposed estimator in terms of the mismatch error, and show that the estimator is well-behaved under model mismatch.
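To make the setting concrete, the following is a minimal sketch of the data model described above, together with one natural way to recover the combination weights. It is an illustration under stated assumptions, not necessarily the estimator proposed in the paper: the logistic basis CDFs, the way the context shifts them, the evaluation grid, and the unconstrained least-squares fit are all hypothetical choices made here, as are the names `basis_cdfs`, `simulate`, and `fit_theta`.

```python
import numpy as np


def logistic_cdf(t, loc):
    """CDF of a unit-scale logistic distribution centered at loc."""
    return 1.0 / (1.0 + np.exp(-(t - loc)))


def basis_cdfs(x, grid, offsets):
    """Return a (K, d) matrix whose column i is basis CDF i on the grid.

    Assumption: the context x simply shifts the location of every basis.
    """
    return logistic_cdf(grid[:, None], offsets[None, :] + x)


def simulate(n, theta, offsets, rng):
    """Draw contexts x_j and samples y_j from the mixture CDF sum_i theta_i * Phi_i(x_j, .)."""
    x = rng.uniform(-1.0, 1.0, size=n)
    comp = rng.choice(len(theta), size=n, p=theta)  # pick one basis per sample
    y = rng.logistic(loc=offsets[comp] + x, scale=1.0)
    return x, y


def fit_theta(x, y, grid, offsets):
    """Least-squares fit of the weights.

    Regresses the empirical indicators 1{y_j <= t} on the basis CDFs over a
    grid of t values by solving the normal equations.
    """
    d = len(offsets)
    gram = np.zeros((d, d))
    rhs = np.zeros(d)
    for xj, yj in zip(x, y):
        B = basis_cdfs(xj, grid, offsets)      # (K, d) basis values
        ind = (yj <= grid).astype(float)       # (K,) one-sample empirical CDF
        gram += B.T @ B
        rhs += B.T @ ind
    return np.linalg.solve(gram, rhs)


rng = np.random.default_rng(0)
theta_true = np.array([0.5, 0.3, 0.2])   # hypothetical ground-truth weights
offsets = np.array([-3.0, 0.0, 3.0])     # hypothetical basis locations
grid = np.linspace(-10.0, 10.0, 201)     # discretization of the t axis

x, y = simulate(5000, theta_true, offsets, rng)
theta_hat = fit_theta(x, y, grid, offsets)
print("true weights:     ", theta_true)
print("estimated weights:", np.round(theta_hat, 3))
```

Because the expected value of the indicator 1{y ≤ t} equals the mixture CDF at t, the least-squares fit targets the true weights, and the estimated CDF at a new context x is `basis_cdfs(x, grid, offsets) @ theta_hat`. The fit here is unconstrained, so the estimated weights are not forced to be nonnegative or to sum to one.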