Discovering Domain Orders through Order Dependencies

05/28/2020
by Reza Karegar, et al.

Much real-world data comes with explicitly defined domain orders; e.g., lexicographic order for strings, numeric order for integers, and chronological order for time. Our goal is to discover implicit domain orders that we do not already know; for instance, that the order of months in the Lunar calendar is Corner < Apricot < Peach, and so on. To do so, we enhance data profiling methods by discovering implicit domain orders in data through order dependencies (ODs). We first identify tractable special cases and then proceed towards the most general case, which we prove is NP-complete. Nevertheless, we show that the general case can be handled effectively by a SAT solver. We also propose an interestingness measure to rank the discovered implicit domain orders. Finally, we report the results of an experimental evaluation on real-world datasets.
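
To make the idea concrete, here is a minimal sketch of one very simple special case. It is not the algorithm from the paper. It assumes each row pairs a value from a column with a known order (say, a day-of-year number) with a label from a column whose order is unknown (say, a Lunar month name). The function name `derive_implicit_order` and the input layout are hypothetical, chosen only for illustration. The sketch returns an order over the labels if sorting by the known column also keeps each label in one contiguous, non-overlapping block, and `None` otherwise.

```python
def derive_implicit_order(rows):
    """Derive an order over labels from a column with a known order.

    rows: iterable of (known_value, label) pairs (hypothetical layout).
    Returns a list of labels in the derived order, or None when the
    labels' value ranges overlap and no order is implied by this check.
    """
    # Track the smallest and largest known value seen for each label.
    spans = {}
    for known, label in rows:
        lo, hi = spans.get(label, (known, known))
        spans[label] = (min(lo, known), max(hi, known))

    # Candidate order: labels sorted by their smallest known value.
    ordered = sorted(spans, key=lambda label: spans[label][0])

    # The candidate is accepted only if consecutive ranges are disjoint.
    for prev, nxt in zip(ordered, ordered[1:]):
        if spans[prev][1] >= spans[nxt][0]:
            return None
    return ordered


rows = [(45, "Apricot"), (10, "Corner"), (70, "Peach"), (20, "Corner")]
print(derive_implicit_order(rows))
```

The general problem described in the abstract is harder than this check: it must combine evidence from several dependencies, which is where the NP-completeness result and the SAT encoding come in.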
