Depth Pruning with Auxiliary Networks for TinyML

04/22/2022
by Josen Daniel De Leon, et al.

Pruning is a neural network optimization technique that sacrifices accuracy in exchange for lower computational requirements. Pruning has been useful when working with the extremely constrained environments of tinyML. Unfortunately, special hardware requirements and limited study of its effectiveness on already compact models prevent its wider adoption. Depth pruning is a form of pruning that requires no specialized hardware but suffers from a large accuracy falloff. To improve this, we propose a modification that utilizes a highly efficient auxiliary network as an effective interpreter of intermediate feature maps. Our results show a parameter reduction of 93% on the Visual Wakewords (VWW) task and 28% on the Keyword Spotting (KWS) task, with an accuracy cost of 0.65% and 1.06%, respectively. When evaluated on a Cortex-M0 microcontroller, our proposed method reduces the VWW model size by 4.7x and latency by 1.6x while counterintuitively gaining 1% accuracy. The KWS model size on Cortex-M0 was also reduced by 1.2x and latency by 1.2x at the cost of 2.21% accuracy.
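The core idea can be sketched in a few lines: truncate the backbone at an intermediate depth and attach a small auxiliary head that classifies directly from the intermediate feature maps. The sketch below is illustrative only; the layer names and parameter counts are hypothetical and are not taken from the paper.

```python
# Illustrative sketch of depth pruning with an auxiliary head.
# Layer names and parameter counts are hypothetical, not from the paper.

def param_count(layers):
    """Total trainable parameters across a list of (name, params) layers."""
    return sum(p for _, p in layers)

def depth_prune(layers, keep_depth, aux_head):
    """Truncate the backbone after `keep_depth` layers and attach a small
    auxiliary head that classifies from the intermediate feature map."""
    return layers[:keep_depth] + aux_head

# Hypothetical backbone: parameter counts grow with depth, as in typical CNNs,
# so dropping the deepest layers removes most of the parameters.
backbone = [("conv1", 5_000), ("conv2", 20_000), ("conv3", 80_000),
            ("conv4", 320_000), ("classifier", 10_000)]

# Lightweight auxiliary head replacing the deeper (and heavier) layers.
aux_head = [("aux_pool_fc", 4_000)]

pruned = depth_prune(backbone, keep_depth=2, aux_head=aux_head)
reduction = 1 - param_count(pruned) / param_count(backbone)
print(f"parameters: {param_count(backbone)} -> {param_count(pruned)}")
print(f"reduction: {reduction:.0%}")
```

Because later layers of a CNN typically dominate the parameter budget, even a shallow truncation point yields a large reduction; the auxiliary network's job is to recover the accuracy that the removed layers would have provided.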

