HNMTP Conv: Optimize Convolution Algorithm for Single-Image Convolution Neural Network Inference on Mobile GPUs

09/06/2019
by Zhuoran Ji, et al.

Convolutional neural networks are widely used in mobile applications. However, existing GPU convolution algorithms are designed for mini-batch neural network training, and single-image convolutional neural network inference on mobile GPUs is not well studied. After discussing the differences in usage and examining the existing convolution algorithms, we propose the HNTMP convolution algorithm. The HNTMP convolution algorithm achieves a 14.6× speedup over the most popular convolution algorithm, im2col, and a 2.1× speedup over direct convolution, which is, as far as we know, the fastest existing convolution algorithm.
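For readers unfamiliar with the two baselines named above, the following is a minimal sketch of direct convolution and im2col convolution for a single image. It is not the HNTMP algorithm, which this abstract does not describe, and it runs on the CPU in plain Python rather than on a mobile GPU. The function names, the nested-list layout (`image[C][H][W]`, `kernels[K][C][R][S]`), and the choice of stride 1 with no padding are assumptions made purely for illustration. As in most CNN frameworks, "convolution" here means cross-correlation (the kernel is not flipped).

```python
def _shapes(image, kernels):
    # Helper (hypothetical): derive dimensions from the nested lists.
    C, H, W = len(image), len(image[0]), len(image[0][0])
    R, S = len(kernels[0][0]), len(kernels[0][0][0])
    return C, R, S, H - R + 1, W - S + 1


def direct_conv(image, kernels):
    """Direct convolution: loop over every output pixel and accumulate."""
    C, R, S, out_h, out_w = _shapes(image, kernels)
    out = []
    for k in kernels:
        plane = [[0.0] * out_w for _ in range(out_h)]
        for y in range(out_h):
            for x in range(out_w):
                acc = 0.0
                for c in range(C):
                    for r in range(R):
                        for s in range(S):
                            acc += image[c][y + r][x + s] * k[c][r][s]
                plane[y][x] = acc
        out.append(plane)
    return out


def im2col_conv(image, kernels):
    """im2col convolution: unfold patches, then do one matrix multiply."""
    C, R, S, out_h, out_w = _shapes(image, kernels)
    # Each entry of `cols` is one receptive field flattened to length C*R*S.
    # This copy is the extra memory traffic that im2col trades for a GEMM.
    cols = [
        [image[c][y + r][x + s]
         for c in range(C) for r in range(R) for s in range(S)]
        for y in range(out_h) for x in range(out_w)
    ]
    # Flatten each kernel in the same (c, r, s) order.
    flat = [
        [k[c][r][s] for c in range(C) for r in range(R) for s in range(S)]
        for k in kernels
    ]
    out = []
    for row in flat:
        # One row of the (K x CRS) by (CRS x out_h*out_w) matrix product.
        vals = [sum(a * b for a, b in zip(row, col)) for col in cols]
        out.append([vals[y * out_w:(y + 1) * out_w] for y in range(out_h)])
    return out
```

Both functions compute the same result; they differ in how the work is organised. The im2col variant materialises a patch matrix whose size grows with the kernel area, whereas the direct variant reads the image in place.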
