Relevance Proximity Graphs for Fast Relevance Retrieval
In plenty of machine learning applications, the most relevant items for a particular query should be efficiently extracted, while the relevance function is based on a highly-nonlinear model, e.g., DNNs or GBDTs. Due to the high computational complexity of such models, exhaustive search is infeasible even for medium-scale problems. To address this issue, we introduce Relevance Proximity Graphs (RPG): an efficient non-exhaustive approach that provides a high-quality approximate solution for maximal relevance retrieval. Namely, we extend the recent similarity graphs framework to the setting, when there is no similarity measure defined on item pairs, which is a common practical use-case. By design, our approach directly maximizes off-the-shelf relevance functions and does not require any proxy auxiliary models. Via extensive experiments, we show that the developed method provides excellent retrieval accuracy while requiring only a few model computations, outperforming indirect models. We open-source our implementation as well as two large-scale datasets to support further research on relevance retrieval.
READ FULL TEXT