Active Inductive Logic Programming for Code Search
Modern search techniques either cannot efficiently incorporate human feedback to refine search results or to express structural or semantic properties of desired code. The key insight of our interactive code search technique ALICE is that user feedback could be actively incorporated to allow users to easily express and refine search queries. We design a query language to model the structure and semantics of code as logic facts. Given a code example with user annotations, ALICE automatically extracts a logic query from features that are tagged as important. Users can refine the search query by labeling one or more examples as desired (positive) or irrelevant (negative). ALICE then infers a new logic query that separates the positives from negative examples via active inductive logic programming. Our comprehensive and systematic simulation experiment shows that ALICE removes a large number of false positives quickly by actively incorporating user feedback. Its search algorithm is also robust to noise and user labeling mistakes. Our choice of leveraging both positive and negative examples and the nested containment structure of selected code is effective in refining search queries. Compared with an existing technique, Critics, ALICE does not require a user to manually construct a search pattern and yet achieves comparable precision and recall with fewer search iterations on average. A case study with users shows that ALICE is easy to use and helps express complex code patterns.
READ FULL TEXT