The ever-increasing computational and storage requirements of modern app...
Emerging deep neural network (DNN) applications require high-performance...
Meeting the staggering bandwidth requirements of today's applications ch...
Shared L1 memory clusters are a common architectural pattern (e.g., in G...
In this paper, we present Quark, an integer RISC-V vector processor spec...
2.5D integration is an important technique to tackle the growing cost of...
Chips with hundreds to thousands of cores require scalable networks-on-c...
Vector architectures are gaining traction for highly efficient processin...
Modern high-performance computing architectures (Multicore, GPU, Manycor...
While parallel architectures based on clusters of Processing Elements (P...
Three-dimensional integrated circuits promise power, performance, and fo...
A key challenge in scaling shared-L1 multi-core clusters towards many-co...
On-chip communication infrastructure is a central component of modern sy...
In this paper, we present Ara, a 64-bit vector processor based on the ve...