Error Analysis and Improving the Accuracy of Winograd Convolution for Deep Neural Networks
Modern deep neural networks (DNNs) spend much of their execution time computing convolutions. Winograd's minimal algorithm for small convolutions can greatly reduce the number of arithmetic operations. However, the large reduction in floating point (FP) operations achieved by these algorithms can come at the cost of poor numerical accuracy. In this paper we analyse the FP error and prove bounds on the error. We show that the "modified" algorithm yields significantly more accurate results. We propose several methods for reducing the FP error of these algorithms. Minimal convolution algorithms depend on the selection of a set of numeric (interpolation) points, which has a large impact on the accuracy of the result. We propose a canonical evaluation ordering, based on Huffman coding, that both reduces FP error and shrinks the point-selection search space. We study point selection experimentally and find empirically good points. We also identify the main factors associated with sets of points that result in low error. In addition, we explore other methods for reducing FP error, including mixed-precision convolution and pairwise addition across DNN channels. Using our methods we can significantly reduce the FP error for a given block size, which allows larger block sizes, and hence reduced computation, to be used.
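As a concrete illustration (not taken from the paper itself), the sketch below shows the standard 1D Winograd minimal convolution F(2,3), which produces 2 outputs of a 3-tap convolution with 4 elementwise multiplies instead of the direct 6. The transform matrices correspond to the well-known interpolation points {0, 1, -1}; choosing different points changes the matrix entries and hence the FP error of the result, which is the point-selection problem the abstract refers to.

```python
# Minimal sketch, assuming the standard F(2,3) Winograd construction
# with interpolation points {0, 1, -1} (Lavin & Gray style matrices).
import numpy as np

# Input transform B^T, filter transform G, output transform A^T.
Bt = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float32)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]], dtype=np.float32)
At = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float32)

def winograd_f23(d, g):
    """Two outputs of a valid 1D convolution of a 4-element input tile d
    with a 3-tap filter g, using 4 multiplies in the transform domain."""
    return At @ ((G @ g) * (Bt @ d))

d = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)  # input tile
g = np.array([0.5, 0.25, -1.0], dtype=np.float32)     # filter
print(winograd_f23(d, g))                  # -> [-2.   -2.25]
# Direct computation for comparison:
# [d0*g0 + d1*g1 + d2*g2,  d1*g0 + d2*g1 + d3*g2]
print(np.convolve(d, g[::-1], mode="valid"))
```

For larger block sizes the construction needs more points, and the extra points tend to have larger magnitudes, which inflates the transform-matrix entries and the FP error; this is what motivates the paper's error analysis and search for good point sets.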