Epicure: Distilling Sequence Model Predictions into Patterns

08/16/2023
by   Miltiadis Allamanis, et al.

Most machine learning models predict a probability distribution over concrete outputs and struggle to predict names accurately when the distribution over sequences has high entropy. Here, we explore finding abstract, high-precision patterns intrinsic to these predictions in order to make abstract predictions that usefully capture rare sequences. In this short paper, we present Epicure, a method that distils the predictions of a sequence model, such as the output of beam search, into simple patterns. Epicure maps a model's predictions into a lattice of increasingly general patterns that subsume the concrete model predictions. On two tasks, predicting a descriptive name for a function given the source code of its body and detecting anomalous names given a function, we show that Epicure yields accurate naming patterns that match the ground truth more often than the single highest-probability model prediction. At a false alarm rate of 10%, Epicure's patterns match the ground truth more often than the best model prediction, making Epicure well-suited for scenarios that require high precision.
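
To make the idea of distilling beam-search output into patterns concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that a pattern is a fixed-length tuple of subtokens in which any position may be replaced by a wildcard, and that a pattern's score is the summed probability of the predictions it matches. The names `WILDCARD`, `generalizations`, `matches`, `distil`, the `threshold` parameter, and the example beam are all hypothetical.

```python
from itertools import product

WILDCARD = "*"  # hypothetical placeholder meaning "any subtoken"


def generalizations(tokens):
    """Yield every pattern formed by replacing a subset of subtokens with a wildcard."""
    for mask in product([False, True], repeat=len(tokens)):
        yield tuple(WILDCARD if hide else tok for tok, hide in zip(tokens, mask))


def matches(pattern, tokens):
    """A pattern matches a prediction of equal length that agrees on every non-wildcard position."""
    return len(pattern) == len(tokens) and all(
        p == WILDCARD or p == t for p, t in zip(pattern, tokens)
    )


def distil(predictions, threshold):
    """Return (pattern, mass) for the most specific pattern whose matched
    probability mass reaches `threshold`, or None if no pattern qualifies.

    "Most specific" is assumed here to mean "fewest wildcards"; ties are
    broken by larger mass.
    """
    candidates = {p for tokens, _ in predictions for p in generalizations(tokens)}
    scored = []
    for pattern in candidates:
        mass = sum(prob for tokens, prob in predictions if matches(pattern, tokens))
        if mass >= threshold:
            scored.append((pattern.count(WILDCARD), -mass, pattern))
    if not scored:
        return None
    _, neg_mass, pattern = min(scored)
    return pattern, -neg_mass


# Hypothetical beam-search output: (subtokens of a function name, probability).
beam = [
    (("get", "user", "name"), 0.30),
    (("get", "user", "id"), 0.25),
    (("get", "account", "name"), 0.20),
    (("fetch", "user", "name"), 0.10),
]

result = distil(beam, threshold=0.5)
print(result)
```

In this toy beam no single concrete name reaches the 0.5 threshold, but the pattern `("get", "user", "*")` does, which illustrates how an abstract pattern can be predicted with more confidence than any one concrete sequence. Enumerating all wildcard subsets is exponential in name length, so this sketch is only suitable for short names.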
