We study the problem of efficient generative inference for Transformer
m...
Large language models have been shown to achieve remarkable performance
...
Recent neural network-based language models have benefited greatly from
...
Recent results in language understanding using neural networks have requ...
Large Transformer models routinely achieve state-of-the-art results on a...
Convolutions are a fundamental building block of modern computer vision
...