Predict; Do not React for Enabling Efficient Fine Grain DVFS in GPUs

04/30/2022
by   Srikant Bharadwaj, et al.

With the continuous improvement of on-chip integrated voltage regulators (IVRs) and fast, adaptive frequency control, dynamic voltage-frequency scaling (DVFS) transition times have shrunk from the microsecond to the nanosecond regime, providing additional opportunities to improve energy efficiency. The key to unlocking the benefits of this continued improvement in voltage-frequency circuit technology is the creation of new, smarter DVFS mechanisms that better adapt to rapid fluctuations in workload demand. It is particularly important to optimize fine-grain DVFS mechanisms for graphics processing units (GPUs) as these chips become ever more important workhorses in the datacenter. However, the massive amount of thread-level parallelism in GPUs makes it uniquely difficult to determine the optimal voltage-frequency state at run-time. Existing solutions, mostly designed for single-threaded CPUs and longer time scales, fail to consider the seemingly chaotic, highly varying nature of GPU workloads at short time scales. This paper proposes a novel prediction mechanism, PCSTALL, that is tailored for emerging DVFS capabilities in GPUs and achieves near-optimal energy efficiency. Using the insights from our fine-grained workload analysis, we propose a wavefront-level program counter (PC) based DVFS mechanism that improves program behavior prediction accuracy by 32 of GPU applications at 1 microsecond DVFS time epochs. Compared to the current state of the art, our PC-based technique achieves 19 optimized for Energy-Delay-Squared Product at 50 microsecond time epochs, reaching 32 technologies.
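
For readers unfamiliar with the metric named in the abstract, the Energy-Delay-Squared Product (ED²P) is conventionally computed as energy multiplied by the square of the delay, so it penalizes slowdowns more heavily than plain energy does. The sketch below only illustrates how a voltage-frequency state could be chosen by minimizing that metric for a single epoch. It is not the PCSTALL mechanism, and the state labels, energy values, and delay values are hypothetical numbers chosen for illustration.

```python
def ed2p(energy_j, delay_s):
    """Energy-Delay-Squared Product: energy times delay squared."""
    return energy_j * delay_s ** 2


def pick_state(candidates):
    """Return the (label, energy, delay) tuple with the lowest ED2P.

    `candidates` is a hypothetical list of per-state estimates for one
    DVFS epoch; a real controller would obtain these from a predictor.
    """
    return min(candidates, key=lambda c: ed2p(c[1], c[2]))


# Hypothetical estimates: (state label, energy in joules, delay in seconds)
states = [
    ("low", 1.0, 3.0),   # least energy, slowest
    ("mid", 1.5, 2.0),
    ("high", 3.0, 1.5),  # most energy, fastest
]

best = pick_state(states)
print(best[0])
```

With these made-up numbers the middle state is selected: the lowest-energy state loses because its long delay is squared, and the fastest state loses because its energy cost outweighs the time it saves.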
