PRINS: Resistive CAM Processing in Storage
Research on near-data, in-storage processing has been gaining momentum in recent years. A typical processing-in-storage architecture places one or several processing cores inside the storage and allows data to be processed without transferring it to the host CPU. Since this approach replicates the von Neumann architecture inside storage, it is exposed to the problems faced by that architecture, especially the bandwidth wall. We present PRINS, a novel in-data processing-in-storage architecture based on Resistive Content Addressable Memory (RCAM). PRINS functions simultaneously as storage and as a massively parallel associative processor. PRINS alleviates the bandwidth wall faced by conventional processing-in-storage architectures by keeping the computing inside the storage arrays, thus implementing in-data, rather than near-data, processing. We show that PRINS may outperform a reference computer architecture with bandwidth-limited external storage. The performance of the PRINS Euclidean distance, dot product and histogram implementations exceeds the attainable performance of a reference architecture by up to four orders of magnitude, depending on the dataset size. The performance of PRINS SpMV (sparse matrix-vector multiplication) may exceed the attainable performance of such a reference architecture by more than two orders of magnitude.
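The "attainable performance" of a bandwidth-limited reference architecture can be illustrated with a roofline-style bound, in which throughput is capped either by peak compute or by storage bandwidth multiplied by the operations performed per byte moved. The sketch below is illustrative only: the function name and every number in it are hypothetical assumptions, not values or methods taken from the paper.

```python
def attainable_gops(peak_gops: float, bandwidth_gbs: float, ops_per_byte: float) -> float:
    """Roofline-style bound: the lower of the compute cap and the bandwidth cap."""
    return min(peak_gops, bandwidth_gbs * ops_per_byte)


# Hypothetical reference machine (assumed values, for illustration only).
PEAK_GOPS = 1000.0        # assumed peak compute throughput, in Gop/s
STORAGE_BW_GBS = 4.0      # assumed external storage bandwidth, in GB/s

# Dot product over 4-byte elements: one multiply and one add per element pair,
# i.e. 2 operations per 8 bytes read from storage.
DOT_PRODUCT_OPS_PER_BYTE = 2 / 8

bound = attainable_gops(PEAK_GOPS, STORAGE_BW_GBS, DOT_PRODUCT_OPS_PER_BYTE)
print(f"bandwidth-limited bound: {bound} Gop/s")
```

Under these assumed numbers the kernel is limited by storage bandwidth rather than by compute, which is the regime in which keeping computation inside the storage arrays could be expected to help.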