Performance-Oriented Neural Architecture Search
Hardware-software co-design is a highly successful strategy for improving the performance of domain-specific computing systems. We argue for applying the same methodology to deep learning; specifically, we propose extending neural architecture search with information about the hardware, so that the model designs it produces are highly efficient in addition to meeting the typical accuracy criteria. Using the task of keyword spotting in audio on edge computing devices, we demonstrate that our approach yields a neural architecture that is not only highly accurate but also efficiently mapped to the computing platform that will perform the inference. With our modified neural architecture search, we demonstrate a 0.88% increase in top-1 accuracy with a 1.85× reduction in latency for keyword spotting in audio on an embedded SoC, and 1.59× on a high-end GPU.
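The abstract does not say how hardware information enters the search. As a minimal sketch of one possible form, the Python below ranks candidate architectures by accuracy minus a penalty for exceeding a latency budget on the target device. The names `Candidate`, `score`, and `select`, the penalty form, and all numbers are hypothetical illustrations, not the authors' method or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate architecture with hypothetical measurements."""
    name: str
    accuracy: float    # validation top-1 accuracy in [0, 1]
    latency_ms: float  # inference latency measured on the target device


def score(c: Candidate, target_ms: float, penalty: float = 0.5) -> float:
    """Accuracy minus a penalty proportional to the relative latency overshoot.

    Assumed objective for illustration only; candidates at or under the
    latency budget are not penalized.
    """
    overshoot = max(0.0, c.latency_ms / target_ms - 1.0)
    return c.accuracy - penalty * overshoot


def select(candidates, target_ms: float, penalty: float = 0.5) -> Candidate:
    """Return the candidate with the highest hardware-aware score."""
    return max(candidates, key=lambda c: score(c, target_ms, penalty))


# Made-up values, chosen only to show the trade-off.
candidates = [
    Candidate("large", accuracy=0.95, latency_ms=40.0),
    Candidate("medium", accuracy=0.94, latency_ms=18.0),
    Candidate("small", accuracy=0.90, latency_ms=9.0),
]

best = select(candidates, target_ms=20.0)
print(best.name)
```

With a 20 ms budget, the sketch prefers the slightly less accurate candidate that fits the budget; setting `penalty=0.0` reduces it to accuracy-only selection, which is the conventional criterion the abstract contrasts against.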