Tight Lower Bound for Comparison-Based Quantile Summaries

05/09/2019
by Graham Cormode, et al.
Quantiles, such as the median or percentiles, provide concise and useful information about the distribution of a collection of items drawn from a linearly ordered universe. We study data structures, called quantile summaries, which keep track of all quantiles up to an error of at most ε. That is, an ε-approximate quantile summary first processes a stream of items and then, given any quantile query 0 < ϕ < 1, returns an item from the stream that is a ϕ'-quantile for some ϕ' = ϕ ± ε. We focus on comparison-based quantile summaries, which can only compare two items and are otherwise completely oblivious of the universe. The best such deterministic quantile summary to date, by Greenwald and Khanna (ACM SIGMOD '01), stores at most O((1/ε)·log εN) items, where N is the number of items in the stream. We prove that this space bound is optimal by providing a matching lower bound. Our result thus rules out the possibility of constructing a deterministic comparison-based quantile summary in space f(ε)·o(log N), for any function f that does not depend on N. Our results also imply a lower bound for randomized algorithms.
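To make the definition of an ε-approximate answer concrete, the following is a minimal Python sketch, not code from the paper. It assumes that a ϕ'-quantile is an item whose rank in sorted order is about ϕ'·N, and that an item with duplicates may claim any rank in its tie range. The function names (`rank_range`, `is_eps_approx_answer`, `query_exact`) are hypothetical. The checker only uses order comparisons on the items, in the spirit of the comparison-based model; the baseline `query_exact` stores all N items, which is exactly the cost a quantile summary is meant to avoid.

```python
from bisect import bisect_left, bisect_right
from math import ceil


def rank_range(sorted_items, x):
    """Return (lo, hi), the 1-based ranks that x may occupy in sorted order.

    If x does not occur in sorted_items, the range is empty (hi < lo).
    """
    lo = bisect_left(sorted_items, x) + 1
    hi = bisect_right(sorted_items, x)
    return lo, hi


def is_eps_approx_answer(stream, phi, eps, answer):
    """Check whether `answer` is a phi'-quantile of `stream` for some
    phi' within eps of phi (assumed rank convention: rank ~ phi' * N)."""
    items = sorted(stream)
    n = len(items)
    lo, hi = rank_range(items, answer)
    if hi < lo:
        return False  # the answer must be an item from the stream
    target = phi * n
    # Some rank r in [lo, hi] satisfies |r - target| <= eps * n
    # exactly when the two intervals overlap.
    return lo - eps * n <= target <= hi + eps * n


def query_exact(stream, phi):
    """Baseline that keeps all N items and answers with zero error."""
    items = sorted(stream)
    n = len(items)
    idx = min(n - 1, max(0, ceil(phi * n) - 1))
    return items[idx]
```

For a stream of the integers 1 to 100 with ϕ = 0.5 and ε = 0.1, any item of rank between 40 and 60 is an acceptable answer under this convention, so 55 passes the check while 70 does not.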
