An exact, cache-localized algorithm for the sub-quadratic convolution of hypercubes
Fast multidimensional convolution can be performed naively in quadratic time and can often be performed more efficiently via the Fourier transform; however, when the dimensionality is large, these algorithms become more challenging. A method is proposed for performing exact hypercube convolution in sub-quadratic time. The method outperforms FFTPACK, called via numpy, and FFTW, called via pyfftw) for hypercube convolution. Embeddings in hypercubes can be paired with sub-quadratic hypercube convolution method to construct sub-quadratic algorithms for variants of vector convolution.
READ FULL TEXT