Transformer language models such as GPT-2 are difficult to quantize beca...
While tasks could come with varying number of instances in realistic
set...
A meta-model is trained on a distribution of similar tasks such that it
...
Few-shot learning aims to build a learner that quickly generalizes to no...