Xiaohu Qie

research

∙ 08/16/2023

OmniZoomer: Learning to Move and Zoom in on Sphere at High-Resolution

Omnidirectional images (ODIs) have become increasingly popular, as their...

0 Zidong Cao, et al. ∙

research

∙ 04/24/2023

HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video

We introduce HOSNeRF, a novel 360 free-viewpoint rendering method that r...

0 Jia-Wei Liu, et al. ∙

research

∙ 04/17/2023

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

Despite the success in large-scale text-to-image generation and text-con...

0 Mingdeng Cao, et al. ∙

research

∙ 03/24/2023

Accelerating Vision-Language Pretraining with Free Language Modeling

The state of the arts in vision-language pretraining (VLP) achieves exem...

2 Teng Wang, et al. ∙

research

∙ 02/16/2023

T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

The incredible generative ability of large-scale text-to-image (T2I) mod...

0 Chong Mou, et al. ∙

research

∙ 01/17/2023

Masked Visual Reconstruction in Language Semantic Space

Both masked image modeling (MIM) and natural language supervision have f...

8 Shusheng Yang, et al. ∙

research

∙ 12/28/2022

Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models

Recent CLIP-guided 3D optimization methods, e.g., DreamFields and PureCL...

0 Jiale Xu, et al. ∙

research

∙ 12/22/2022

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

To reproduce the success of text-to-image (T2I) generation, recent works...

3 Jay Zhangjie Wu, et al. ∙

research

∙ 12/06/2022

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis

Vector-Quantized (VQ-based) generative models usually consist of two bas...

6 Yuchao Gu, et al. ∙

research

∙ 11/22/2022

One for All, All for One: Learning and Transferring User Embeddings for Cross-Domain Recommendation

Cross-domain recommendation is an important method to improve recommende...

0 Chenglin Li, et al. ∙

research

∙ 10/13/2022

Tenrec: A Large-scale Multipurpose Benchmark Dataset for Recommender Systems

Existing benchmark datasets for recommender systems (RS) either are crea...

0 Guanghu Yuan, et al. ∙

research

∙ 06/22/2022

Weakly-supervised Action Localization via Hierarchical Mining

Weakly-supervised action localization aims to localize and classify acti...

0 Jia-Chang Feng, et al. ∙

research

∙ 05/31/2022

DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes

Modeling dynamic scenes is important for many applications such as virtu...

0 Jia-Wei Liu, et al. ∙

research

∙ 05/19/2022

Masked Image Modeling with Denoising Contrast

Since the development of self-supervised visual representation learning ...

13 Kun Yi, et al. ∙

research

∙ 04/26/2022

MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval

Dominant pre-training work for video-text retrieval mainly adopt the "du...

1 Yuying Ge, et al. ∙

research

∙ 03/23/2022

UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection

Finding relevant moments and highlights in videos according to natural l...

0 Ye Liu, et al. ∙

research

∙ 01/13/2022

BridgeFormer: Bridging Video-text Retrieval with Multiple Choice Questions

Pre-training a model to learn transferable video-text representation for...

6 Yuying Ge, et al. ∙


