Xihui Liu

research

∙ 09/18/2023

Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection

Point cloud-based open-vocabulary 3D object detection aims to detect 3D ...

0 Chenming Zhu, et al. ∙

research

∙ 07/12/2023

T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation

Despite the stunning ability to generate high-quality images by recent t...

0 Kaiyi Huang, et al. ∙

research

∙ 06/19/2023

UniG3D: A Unified 3D Object Generation Dataset

The field of generative AI has a transformative impact on various areas,...

0 Qinghong Sun, et al. ∙

research

∙ 06/06/2023

SAM3D: Segment Anything in 3D Scenes

In this work, we propose SAM3D, a novel framework that is able to predic...

0 Yunhan Yang, et al. ∙

research

∙ 05/23/2023

TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale

The ultimate goal for foundation models is realizing task-agnostic, i.e....

3 Ziyun Zeng, et al. ∙

research

∙ 04/25/2023

Seeing is not always believing: A Quantitative Study on Human Perception of AI-Generated Images

Photos serve as a way for humans to record what they experience in their...

0 zeyu-lu, et al. ∙

research

∙ 04/24/2023

Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation

Diffusion models have attained impressive visual quality for image synth...

0 zeyu-lu, et al. ∙

research

∙ 03/30/2023

DDP: Diffusion Model for Dense Visual Prediction

We propose a simple, efficient, yet powerful framework for dense visual ...

0 Yuanfeng Ji, et al. ∙

research

∙ 12/01/2022

Shape-Guided Diffusion with Inside-Outside Attention

Shape can specify key object constraints, yet existing text-to-image dif...

0 Dong Huk Park, et al. ∙

research

∙ 10/11/2022

Point Transformer V2: Grouped Vector Attention and Partition-based Pooling

As a pioneering work exploring transformer architecture for 3D point clo...

0 Xiaoyang Wu, et al. ∙

research

∙ 07/07/2022

Back to the Source: Diffusion-Driven Test-Time Adaptation

Test-time adaptation harnesses test inputs to improve the accuracy of a ...

0 Jin Gao, et al. ∙

research

∙ 06/22/2022

The ArtBench Dataset: Benchmarking Generative Models with Artworks

We introduce ArtBench-10, the first class-balanced, high-quality, cleanl...

0 Peiyuan Liao, et al. ∙

research

∙ 04/26/2022

MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval

Dominant pre-training work for video-text retrieval mainly adopt the "du...

1 Yuying Ge, et al. ∙

research

∙ 01/13/2022

BridgeFormer: Bridging Video-text Retrieval with Multiple Choice Questions

Pre-training a model to learn transferable video-text representation for...

6 Yuying Ge, et al. ∙

research

∙ 08/04/2020

Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions

We propose a novel algorithm, named Open-Edit, which is the first attemp...

13 Xihui Liu, et al. ∙

research

∙ 10/15/2019

Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis

Semantic image synthesis aims at generating photorealistic images from s...

33 Xihui Liu, et al. ∙

research

∙ 09/12/2019

CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval

Text-image cross-modal retrieval is a challenging task in the field of l...

0 Zihao Wang, et al. ∙

research

∙ 03/03/2019

Improving Referring Expression Grounding with Cross-modal Attention-guided Erasing

Referring expression grounding aims at locating certain objects or perso...

0 Xihui Liu, et al. ∙

research

∙ 08/28/2018

Localization Guided Learning for Pedestrian Attribute Recognition

Pedestrian attribute recognition has attracted many attentions due to it...

0 Pengze Liu, et al. ∙

research

∙ 08/05/2018

Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association

Person re-identification is an important task that requires learning dis...

0 Dapeng Chen, et al. ∙

research

∙ 03/22/2018

Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data

The aim of image captioning is to generate similar captions by machine a...

0 Xihui Liu, et al. ∙

research

∙ 09/28/2017

HydraPlus-Net: Attentive Deep Features for Pedestrian Analysis

Pedestrian analysis plays a vital role in intelligent video surveillance...

0 Xihui Liu, et al. ∙

research

∙ 02/21/2017

Object Detection in Videos with Tubelet Proposal Networks

Object detection in videos has drawn increasing attention recently with ...

0 Kai Kang, et al. ∙

