Graph-Based Fuzz Testing for Deep Learning Inference Engine
Testing deep learning (DL) systems is increasingly crucial as DL systems are deployed in ever more domains. Existing testing techniques focus on the quality of specific DL models but pay little attention to the core underlying inference engines (frameworks and libraries) on which all DL models depend. In this study, we design a novel graph-based fuzz testing framework to test DL inference engines. The proposed framework adopts an operator-level coverage criterion based on graph theory and implements six different mutations to generate diverse DL models by exploring combinations of model structures, parameters, and data. It also employs Monte Carlo Tree Search (MCTS) to drive the decision process of DL model generation. The experimental results show that: (1) guided by operator-level coverage, our approach effectively detects three types of undesired behaviors: model conversion failures, inference failures, and output comparison failures; (2) the MCTS-based search outperforms random search in boosting operator-level coverage and detecting exceptions; (3) our mutants are useful for generating new valid test inputs from the original graph or its subgraphs, yielding up to 21.6% more operator-level coverage on average and 24.3 more exceptions captured.
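To make the abstract's key ideas concrete, the following is a minimal illustrative sketch (not the paper's implementation) of a coverage-guided fuzzing loop over DL model graphs: a toy graph representation, an operator-level coverage measure counting which operator types the corpus exercises, and one node-insertion mutation of the kind the framework might use. All names here (`Node`, `Graph`, `operator_level_coverage`, `mutate_insert`, `fuzz`) and the specific operator list are assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    """A toy model-graph node: one operator plus its parameters (assumed representation)."""
    op: str
    params: dict = field(default_factory=dict)

@dataclass
class Graph:
    """A toy DL model graph: nodes and directed edges given as (src, dst) index pairs."""
    nodes: list
    edges: list

def operator_level_coverage(graphs, operator_universe):
    """Fraction of known operator types exercised by any graph in the corpus.
    This is a simplified stand-in for the paper's operator-level coverage criterion."""
    covered = {n.op for g in graphs for n in g.nodes}
    return len(covered & set(operator_universe)) / len(operator_universe)

def mutate_insert(graph, operator_universe, rng):
    """One example mutation (of the six mentioned): attach a new randomly chosen
    operator node to an existing node, returning a new graph (parent unchanged)."""
    g = Graph(list(graph.nodes), list(graph.edges))
    src = rng.randrange(len(g.nodes))
    g.nodes.append(Node(rng.choice(operator_universe)))
    g.edges.append((src, len(g.nodes) - 1))
    return g

def fuzz(seed, operator_universe, budget, rng):
    """Coverage-guided loop: keep a mutant only if it raises operator-level coverage.
    (The paper drives this decision with MCTS; random choice is used here for brevity.)"""
    corpus = [seed]
    for _ in range(budget):
        child = mutate_insert(rng.choice(corpus), operator_universe, rng)
        if operator_level_coverage(corpus + [child], operator_universe) > \
           operator_level_coverage(corpus, operator_universe):
            corpus.append(child)
    return corpus
```

For example, starting from a seed graph containing only `Conv2D` and `ReLU` over a five-operator universe, the seed covers 2/5 of the operators, and the loop accepts mutants only when they exercise a previously unseen operator type.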