Efficient Computation of Quantiles over Joins

05/25/2023
by   Nikolaos Tziavelis, et al.
0

We present efficient algorithms for Quantile Join Queries, abbreviated as the median) under some ordering over the answers to a Join Query (JQ). Our goal is to avoid materializing the set of all join answers, and to achieve quasilinear time in the size of the database, regardless of the total number of answers. A recent dichotomy result rules out the existence of such an algorithm for a general family of queries and orders. Specifically, for acyclic JQs without self-joins, the problem becomes intractable for ordering by sum whenever we join more than two relations (and these joins are not trivial intersections). Moreover, even for basic ranking functions beyond sum, such as min or max over different attributes, so far it is not known whether there is any nontrivial tractable solving what we call a "pivot answer". The second subroutine partitions the space of query answers according to this pivot, and continues searching in one partition that is represented as new develop an algorithm that works for a large class of ranking functions that are appropriately monotone. The second subroutine requires a customized construction for the specific ranking function at hand. We show the benefit and generality of our approach by using it to establish several new complexity results. First, we prove the tractability of min and max for all acyclic JQs, thereby resolving the above question. Second, we extend the previous dichotomy for sum to all partial sums. Third, we handle the intractable cases of sum by devising a deterministic approximation scheme that applies to every acyclic JQ.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset