Multi-source Deep Gaussian Process Kernel Learning

02/07/2020
by Chi-Ken Lu, et al.

For many problems, relevant data are plentiful but explicit knowledge is not. Predictions about target variables may be informed by data sources that are noisy but plentiful, or by data of which the target variable is merely some function. Interpretable and flexible machine learning methods capable of fusing data across sources are lacking. We generalize Deep Gaussian Processes (DGPs) so that GPs in intermediate layers can represent the posterior distribution summarizing the data from a related source. We model this prior-posterior stacking DGP with a single GP: the exact second moment of the DGP is calculated analytically and taken as the kernel function of that GP. The result is a kernel that captures effective correlation through function composition, reflects the structure of the observations from other data sources, and can be used to inform prediction based on limited direct observations. The approximation of the prior-posterior DGP can therefore be viewed as a novel kernel composition that blends the kernels in different layers and has explicit dependence on the data. We consider two synthetic multi-source prediction problems: (a) predicting a target variable that is merely a function of the source data, and (b) predicting noise-free data using a kernel trained on noisy data. Our method produces better predictions and tighter uncertainty on the synthetic data than a standard GP and another DGP method, suggesting that our data-informed approximate DGPs are a powerful tool for integrating data across sources.
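To make the construction concrete, the sketch below implements the second-moment kernel for the special case in which both layers use squared-exponential kernels, where the expectation over the inner GP posterior has a Gaussian closed form. This is a minimal illustration of the idea under those assumptions, not the authors' implementation; all names (se_kernel, posterior_moments, effective_kernel, ell_in, ell_out, noise) are hypothetical.

```python
# Minimal sketch: a "prior-posterior" effective kernel. An inner GP is
# conditioned on plentiful source data; the outer squared-exponential
# kernel is then averaged analytically over that posterior, yielding a
# data-dependent kernel for a single GP on the target task.
# Assumption: squared-exponential kernels in both layers (not from the paper).
import numpy as np

def se_kernel(X1, X2, ell, var=1.0):
    """Squared-exponential kernel matrix between row-wise input sets."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return var * np.exp(-0.5 * d2 / ell**2)

def posterior_moments(X, X_src, y_src, ell_in, noise):
    """Posterior mean and covariance of the inner GP given source data."""
    K = se_kernel(X_src, X_src, ell_in) + noise**2 * np.eye(len(X_src))
    Ks = se_kernel(X, X_src, ell_in)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_src))
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    cov = se_kernel(X, X, ell_in) - v.T @ v
    return mu, cov

def effective_kernel(X, X_src, y_src, ell_in, ell_out, noise):
    """Second moment of the two-layer composition, used as a GP kernel.

    With f drawn from the inner posterior, f(x) - f(x') is Gaussian with
    mean mu_d and variance s2_d, so E[exp(-(f(x)-f(x'))^2 / (2 ell_out^2))]
    has the closed form evaluated below.
    """
    mu, cov = posterior_moments(X, X_src, y_src, ell_in, noise)
    var = np.diag(cov)
    mu_d = mu[:, None] - mu[None, :]                # mean of f(x) - f(x')
    s2_d = var[:, None] + var[None, :] - 2.0 * cov  # variance of f(x) - f(x')
    s2_d = np.maximum(s2_d, 0.0)                    # guard against round-off
    scale = 1.0 / np.sqrt(1.0 + s2_d / ell_out**2)
    return scale * np.exp(-0.5 * mu_d**2 / (ell_out**2 + s2_d))
```

For prediction, the effective kernel would be evaluated jointly over the target training and test inputs (so the cross-covariances of the inner posterior are retained) and then used in place of a standard kernel in ordinary GP regression on the scarce target observations.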
