Longitudinal Analysis of the Applicability of Program Repair on Past Commits
The applicability of program repair in the real world is a little-researched topic. Existing program repair systems tend to be tested only on small bug datasets, such as Defects4J, that are not fully representative of real-world projects. In this paper, we report on a longitudinal analysis of software repositories to investigate whether past commits are amenable to program repair. Our key insight is to compute whether or not a commit lies in the search space of program repair systems. For this purpose, we present RSCommitDetector, which takes a Git repository as input and, after performing a series of static analyses, outputs a list of commits whose corresponding source code changes could likely be generated by notable repair systems. We call these commits “repair-space commits”, meaning that they are considered to lie in the search space of a repair system. Using RSCommitDetector, we conduct a study on 41,612 commits from the history of 72 GitHub repositories. The results of this study show that 1.77% of these commits are repair-space commits: they lie in the search space of at least one of the eight repair systems we consider. We use an original methodology to validate our approach and show that the precision and recall of RSCommitDetector are 77% and 92%, respectively. To our knowledge, this is the first study of the applicability of program repair with search space analysis.
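As a reading aid for the precision and recall figures above, the following minimal sketch shows how the two metrics are conventionally computed for a commit detector. It is not the validation methodology of the paper, which the abstract does not describe; the function name `precision_recall` and the commit identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(detected, relevant):
    """Return (precision, recall) for a set of detected commits.

    detected: commits the detector flagged as repair-space commits.
    relevant: commits that truly are repair-space commits (ground truth).
    Both arguments are hypothetical inputs used for illustration only.
    """
    detected, relevant = set(detected), set(relevant)
    true_positives = len(detected & relevant)
    # Precision: fraction of flagged commits that are truly repair-space.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: fraction of true repair-space commits that were flagged.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up commit identifiers, not data from the study.
flagged = {"c1", "c2", "c3", "c4"}
ground_truth = {"c1", "c2", "c3", "c5"}
print(precision_recall(flagged, ground_truth))
```

Under this standard definition, a precision of 77% would mean that roughly three out of four flagged commits are genuine repair-space commits, and a recall of 92% would mean that few genuine ones are missed.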