Unsupervised performance estimation, or evaluating how well models perfo...
Conversation agents fueled by Large Language Models (LLMs) are providing...
Large-scale multi-modal training with image-text pairs imparts strong ge...
Pre-trained vision-language (V-L) models such as CLIP have shown excelle...
Existing open-vocabulary object detectors typically enlarge their vocabu...
In the pursuit of achieving ever-increasing accuracy, large and complex...
Recent research in self-supervised learning (SSL) has shown its capabili...
A bipartite graph consists of two disjoint vertex sets, where vertices o...
Systematic reviews, which summarize and synthesize all the current resea...