While large-scale pre-trained text-to-image models can synthesize divers...
Existing autoregressive models follow the two-stage generation paradigm ...
Existing vector quantization (VQ) based autoregressive models follow a
t...
Text-to-image generation aims at generating realistic images which are
s...