The art of BART: On flexibility of Bayesian forests
Considerable effort has been directed toward developing asymptotically minimax procedures for recovering functions and densities. These methods often rely on somewhat arbitrary and restrictive assumptions such as isotropy or spatial homogeneity. This work enhances the theoretical understanding of Bayesian forests (including BART) under substantially relaxed smoothness assumptions. In particular, we provide a comprehensive study of asymptotic optimality and posterior contraction of Bayesian forests when the regression function has anisotropic smoothness that may vary over the function domain. We introduce a new class of sparse piecewise heterogeneous anisotropic Hölder functions and derive their minimax rate of estimation in high-dimensional scenarios under the L_2 loss. Next, we find that the default Bayesian CART prior, coupled with a subset selection prior for sparse estimation in high-dimensional scenarios, adapts to unknown heterogeneous smoothness and sparsity. These results show that Bayesian forests are uniquely suited to more general estimation problems that would render other default machine learning tools, such as Gaussian processes, suboptimal. Beyond nonparametric regression, we also show that Bayesian forests can be successfully applied to many other problems, including density estimation and binary classification.
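For orientation, the block below sketches the classical homogeneous anisotropic Hölder condition and its well-known minimax rate. It is background only, not the new piecewise heterogeneous class or the rate derived in this work. The symbols d (dimension), \alpha_j (smoothness in coordinate j), \lambda (Hölder constant), and \bar{\alpha} (harmonic mean smoothness) are notation assumed here for illustration and may differ from the notation of the full text.

```latex
% Classical anisotropic Hölder condition on [0,1]^d with \alpha_j \in (0,1]
% (assumed notation; background, not the class introduced in the paper):
\[
  |f(x) - f(y)| \;\le\; \lambda \sum_{j=1}^{d} |x_j - y_j|^{\alpha_j},
  \qquad x, y \in [0,1]^d .
\]
% Harmonic mean of the coordinate-wise smoothness levels:
\[
  \bar{\alpha}^{-1} \;=\; \frac{1}{d} \sum_{j=1}^{d} \alpha_j^{-1} .
\]
% Classical minimax rate under the L_2 loss for this homogeneous class:
\[
  \varepsilon_n \;\asymp\; n^{-\bar{\alpha}/(2\bar{\alpha} + d)} .
\]
% Isotropic special case: \alpha_j = \alpha for all j gives \bar{\alpha} = \alpha
% and the familiar rate n^{-\alpha/(2\alpha + d)}.
```

In this classical setting the smoothness vector is the same everywhere on the domain. The heterogeneous setting described above instead allows the smoothness to change from one region of the domain to another, and adds sparsity in the set of active coordinates.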