PATSQL: Efficient Synthesis of SQL Queries from Example Tables with Quick Inference of Projected Columns

10/12/2020
by Keita Takenouchi, et al.

SQL is one of the most popular tools for data analysis, and it is used by an increasing number of users who lack expertise in databases. To help such non-experts write correct SQL queries, several studies have proposed programming-by-example approaches. In these approaches, the user can obtain a desired query just by giving input and output (I/O) tables as an example. While existing methods support a variety of SQL features such as aggregation and nested queries, they suffer a significant increase in computational cost as the scale of the I/O tables increases. In this paper, we propose an efficient algorithm that synthesizes SQL queries from I/O tables. Its strengths lie in both its execution time and the scale of the tables it supports. We adopt a sketch-based synthesis algorithm and focus on the quick inference of the columns used in the projection operator. In particular, we restrict the structures of sketches based on transformation rules in relational algebra and propagate a novel form of constraint from the output table in a top-down manner. We implemented this algorithm in our tool PATSQL and evaluated it on 118 queries from prior benchmarks and Kaggle's tutorials. As a result, PATSQL solved 72 within a second. Our tool is available at https://naist-se.github.io/patsql/ .
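To make the programming-by-example setting concrete, the following is a minimal sketch in Python. It is not PATSQL's algorithm or API. It only illustrates the general idea of narrowing down projected columns from the output table before searching: an input column is kept as a candidate for an output column only if its values cover that output column's values. The function names, the toy table, and the fixed list of candidate predicates are all assumptions made for illustration, and the sketch handles only simple `SELECT ... WHERE` queries over one non-empty table.

```python
import sqlite3
from itertools import product


def candidate_columns(input_rows, input_cols, output_rows):
    """For each output column, list the input columns whose values cover it.

    Coverage is only a necessary condition for a plain projection,
    so the result is a set of candidates, not a final answer.
    """
    candidates = []
    for j in range(len(output_rows[0])):
        out_vals = {row[j] for row in output_rows}
        covering = [
            name
            for i, name in enumerate(input_cols)
            if out_vals <= {row[i] for row in input_rows}
        ]
        candidates.append(covering)
    return candidates


def synthesize(input_rows, input_cols, output_rows, predicates):
    """Return the first SELECT/WHERE query that reproduces output_rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE t ({', '.join(input_cols)})")
    placeholders = ", ".join(["?"] * len(input_cols))
    conn.executemany(f"INSERT INTO t VALUES ({placeholders})", input_rows)

    expected = sorted(output_rows)  # compare as multisets, ignoring row order
    pruned = candidate_columns(input_rows, input_cols, output_rows)
    for cols in product(*pruned):
        for pred in predicates:  # hypothetical, hand-written predicate pool
            query = f"SELECT {', '.join(cols)} FROM t WHERE {pred}"
            if sorted(conn.execute(query).fetchall()) == expected:
                return query
    return None


# Toy I/O example (invented data).
input_cols = ["name", "dept", "salary"]
input_rows = [("Ann", "eng", 120), ("Bob", "ops", 90), ("Cy", "eng", 100)]
output_rows = [("Ann", 120), ("Cy", 100)]
predicates = ["1 = 1", "dept = 'eng'", "salary > 100"]

print(candidate_columns(input_rows, input_cols, output_rows))
print(synthesize(input_rows, input_cols, output_rows, predicates))
```

In this toy case the coverage check leaves a single candidate per output column, so only one projection needs to be tried against each predicate. Without pruning, the search would enumerate every pair of input columns, which is the kind of cost that grows quickly as tables get wider.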
