Recent Diffusion Transformers (e.g., DiT) have demonstrated their powerf...
In recent years, the field of computer vision has seen significant advan...
Diffusion models have proven to be highly effective in generating high-q...
This paper presents DetCLIPv2, an efficient and scalable training framew...
Open-world object detection, as a more general and challenging goal, aim...
Unsupervised large-scale vision-language pre-training has shown promisin...
The state-of-the-art object detection method is complicated with various...