Provably-Robust Runtime Monitoring of Neuron Activation Patterns
For deep neural networks (DNNs) to be used in safety-critical autonomous driving tasks, it is desirable to monitor in operation time if the input for the DNN is similar to the data used in DNN training. While recent results in monitoring DNN activation patterns provide a sound guarantee due to building an abstraction out of the training data set, reducing false positives due to slight input perturbation has been an issue towards successfully adapting the techniques. We address this challenge by integrating formal symbolic reasoning inside the monitor construction process. The algorithm performs a sound worst-case estimate of neuron values with inputs (or features) subject to perturbation, before the abstraction function is applied to build the monitor. The provable robustness is further generalized to cases where monitoring a single neuron can use more than one bit, implying that one can record activation patterns with a fine-grained decision on the neuron value interval.
READ FULL TEXT