Stochastic Rounding: Algorithms and Hardware Accelerator
Algorithms and a hardware accelerator for performing stochastic rounding (SR) are presented. The main goal is to augment the ARM M4F based multi-core processor SpiNNaker2 with a more flexible rounding functionality than is available in the ARM processor itself. The motivation of adding such an accelerator in hardware is based on our previous results showing improvements in numerical accuracy of ODE solvers in fixed-point arithmetic with SR, compared to standard round to nearest or bit truncation rounding modes. Furthermore, performing SR purely in software can be expensive, due to requirement of a pseudorandom number generator (PRNG), multiple masking and shifting instructions, and an addition operation. Also, saturation of the rounded values is included, since rounding is usually followed by saturation, which is especially important in fixed-point arithmetic due to a narrow dynamic range of representable values. The main intended use of the accelerator is to round fixed-point multiplier outputs, which are returned unrounded by the ARM processor in a wider fixed-point format than the arguments.
READ FULL TEXT