Depth Pruning with Auxiliary Networks for TinyML

04/22/2022
by Josen Daniel De Leon, et al.

Pruning is a neural network optimization technique that sacrifices accuracy in exchange for lower computational requirements. Pruning has been useful when working with the extremely constrained environments of tinyML. Unfortunately, special hardware requirements and limited study of its effectiveness on already compact models prevent its wider adoption. Depth pruning is a form of pruning that requires no specialized hardware but suffers from a large accuracy falloff. To improve this, we propose a modification that utilizes a highly efficient auxiliary network as an effective interpreter of intermediate feature maps. Our results show a parameter reduction of 93% on the Visual Wakewords (VWW) task and 28% on the Keyword Spotting (KWS) task, with an accuracy cost of 0.65% and 1.06%, respectively. When evaluated on a Cortex-M0 microcontroller, our proposed method reduces the VWW model size by 4.7x and latency by 1.6x while counterintuitively gaining 1% accuracy. The KWS model size on Cortex-M0 was also reduced by 1.2x and latency by 1.2x at the cost of 2.21% accuracy.
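The core idea can be sketched in a few lines: truncate the backbone at an intermediate depth and attach a small auxiliary head that classifies directly from the intermediate feature maps. The sketch below is illustrative only; the layer names and parameter counts are hypothetical and are not taken from the paper.

```python
# Illustrative sketch of depth pruning with an auxiliary head.
# Layer names and parameter counts are hypothetical, not from the paper.

def param_count(layers):
    """Total trainable parameters across a list of (name, params) layers."""
    return sum(p for _, p in layers)

def depth_prune(layers, keep_depth, aux_head):
    """Truncate the backbone after `keep_depth` layers and attach a small
    auxiliary head that classifies from the intermediate feature map."""
    return layers[:keep_depth] + aux_head

# Hypothetical backbone: parameter counts grow with depth, as in typical CNNs,
# so dropping the deepest layers removes most of the parameters.
backbone = [("conv1", 5_000), ("conv2", 20_000), ("conv3", 80_000),
            ("conv4", 320_000), ("classifier", 10_000)]

# Lightweight auxiliary head replacing the deeper (and heavier) layers.
aux_head = [("aux_pool_fc", 4_000)]

pruned = depth_prune(backbone, keep_depth=2, aux_head=aux_head)
reduction = 1 - param_count(pruned) / param_count(backbone)
print(f"parameters: {param_count(backbone)} -> {param_count(pruned)}")
print(f"reduction: {reduction:.0%}")
```

Because later layers of a CNN typically dominate the parameter budget, even a shallow truncation point yields a large reduction; the auxiliary network's job is to recover the accuracy that the removed layers would have provided.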

