In this work, we propose Retentive Network (RetNet) as a foundation arch...
Large pretrained language models have shown surprising In-Context Learni...
Position modeling plays a critical role in Transformers. In this paper, ...
Large language models have exhibited intriguing in-context learning capa...