Distributed Deep Convolutional Neural Networks for the Internet-of-Things
Because of their high demands on computation and memory, deep learning solutions are mostly restricted to high-performance computing units, e.g., those found in servers, the Cloud, and computing centers. In pervasive systems, e.g., those built on Internet-of-Things (IoT) technologies, this means transmitting the data acquired by IoT sensors to the remote computing platform and waiting for its output. This approach becomes infeasible when remote connectivity is unavailable or limited in bandwidth. Moreover, it introduces uncertainty in the latency between data production and decision making, which, in turn, might impair control-loop stability when the response is used to drive IoT actuators. To support a real-time recall phase directly at the IoT level, deep learning solutions must be completely rethought with the memory and computation constraints of IoT units in mind. In this paper we focus on Convolutional Neural Networks (CNNs), a specific deep learning solution for image and video classification, and introduce a methodology for distributing their computation onto the units of the IoT system. We formalize this methodology as an optimization problem that minimizes the latency between the data-gathering phase and the subsequent decision-making phase. The methodology supports multiple IoT data sources as well as multiple CNNs running on the same IoT system, making it a general-purpose distributed computing platform for CNN-based applications that demand autonomy, low decision latency, and high Quality-of-Service.
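To make the optimization view concrete, the following is a minimal toy sketch, not the formulation used in the paper. It assumes that each CNN layer has a memory footprint, a computation cost, and an output size; that each IoT unit has a memory capacity and a processing speed; and that moving one unit of data between two IoT units costs a fixed delay. All names (`best_placement`, `hop_delay`, `input_size`) and all numbers are hypothetical, and the exhaustive search is only practical for very small systems.

```python
from itertools import product

def best_placement(layers, units, hop_delay, source, input_size):
    """Exhaustively search layer-to-unit assignments (toy model, assumed here).

    layers:     list of (memory, operations, output_size), one per CNN layer
    units:      list of (memory_capacity, operations_per_second), one per IoT unit
    hop_delay:  hop_delay[i][j] = seconds to move one unit of data from unit i to j
    source:     index of the unit that acquires the input data
    input_size: size of the acquired input data
    Returns (latency, assignment); assignment is None if nothing is feasible.
    """
    best_latency, best_assignment = float("inf"), None
    for assignment in product(range(len(units)), repeat=len(layers)):
        # Memory constraint: layers placed on a unit must fit in its memory.
        used = [0.0] * len(units)
        for (memory, _, _), unit in zip(layers, assignment):
            used[unit] += memory
        if any(used[u] > units[u][0] for u in range(len(units))):
            continue

        # Latency: data transfer to the layer's unit plus its computation time.
        latency, previous, data = 0.0, source, input_size
        for (_, operations, output_size), unit in zip(layers, assignment):
            latency += data * hop_delay[previous][unit]
            latency += operations / units[unit][1]
            previous, data = unit, output_size

        if latency < best_latency:
            best_latency, best_assignment = latency, assignment
    return best_latency, best_assignment

# Hypothetical two-layer CNN on two IoT units; neither unit can hold both layers.
example_layers = [(6, 100.0, 1.0), (6, 200.0, 1.0)]
example_units = [(10, 100.0), (10, 200.0)]
example_delay = [[0.0, 0.5], [0.5, 0.0]]
print(best_placement(example_layers, example_units, example_delay, 0, 2.0))
```

In this toy instance the memory limits force the two layers onto different units, and the search keeps the placement with the lower end-to-end latency. A realistic system would replace the exhaustive enumeration with a proper solver and a measured cost model.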