Rui Qian

research

∙ 08/19/2023

Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos

Self-supervised methods have shown remarkable progress in learning high-...

0 Rui Qian, et al. ∙

research

∙ 08/08/2023

Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation

Transformers have become the primary backbone of the computer vision com...

0 Shuangrui Ding, et al. ∙

research

∙ 03/16/2023

Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation

Animating virtual avatars to make co-speech gestures facilitates various...

0 Lingting Zhu, et al. ∙

research

∙ 10/01/2022

Motion-inductive Self-supervised Object Discovery in Videos

In this paper, we consider the task of unsupervised object discovery in ...

7 Shuangrui Ding, et al. ∙

research

∙ 07/26/2022

Static and Dynamic Concepts for Self-supervised Video Representation Learning

In this paper, we propose a novel learning scheme for self-supervised vi...

0 Rui Qian, et al. ∙

research

∙ 07/21/2022

Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset

We present a new benchmark dataset, Sapsucker Woods 60 (SSW60), for adva...

4 Grant Van Horn, et al. ∙

research

∙ 07/12/2022

Dual Contrastive Learning for Spatio-temporal Representation

Contrastive learning has shown promising potential in self-supervised sp...

0 Shuangrui Ding, et al. ∙

research

∙ 03/30/2022

Controllable Augmentations for Video Representation Learning

This paper focuses on self-supervised video representation learning. Mos...

11 Rui Qian, et al. ∙

research

∙ 03/24/2022

Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation

Generating speech-consistent body and gesture movements is a long-standi...

0 Xian Liu, et al. ∙

research

∙ 02/13/2022

Visual Sound Localization in the Wild by Cross-Modal Interference Erasing

The task of audio-visual sound source localization has been well studied...

5 Xian Liu, et al. ∙

research

∙ 12/22/2021

Class-aware Sounding Objects Localization via Audiovisual Correspondence

Audiovisual scenes are pervasive in our daily life. It is commonplace fo...

0 Di Hu, et al. ∙

research

∙ 09/30/2021

Motion-aware Self-supervised Video Representation Learning via Foreground-background Merging

In light of the success of contrastive learning in the image domain, cur...

0 Shuangrui Ding, et al. ∙

research

∙ 09/03/2021

Revisiting 3D ResNets for Video Recognition

A recent work from Bello shows that training and scaling strategies may ...

21 Xianzhi Du, et al. ∙

research

∙ 08/04/2021

Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization

The crux of self-supervised video representation learning is to build ge...

6 Rui Qian, et al. ∙

research

∙ 07/10/2021

TTAN: Two-Stage Temporal Alignment Network for Few-shot Action Recognition

Few-shot action recognition aims to recognize novel action classes (quer...

3 Shuyuan Li, et al. ∙

research

∙ 04/22/2021

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

We present a framework for learning multimodal representations from unla...

21 Hassan Akbari, et al. ∙

research

∙ 12/13/2020

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

Building instance segmentation models that are data-efficient and can ha...

18 Golnaz Ghiasi, et al. ∙

research

∙ 10/12/2020

Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching

Discriminatively localizing sounding objects in cocktail-party, i.e., mi...

0 Di Hu, et al. ∙

research

∙ 08/30/2020

Finding Action Tubes with a Sparse-to-Dense Framework

The task of spatial-temporal action detection has attracted increasing a...

4 Yuxi Li, et al. ∙

research

∙ 07/13/2020

Multiple Sound Sources Localization from Coarse to Fine

How to visually localize multiple sound sources in unconstrained videos ...

0 Rui Qian, et al. ∙

research

∙ 05/09/2020

Human in Events: A Large-Scale Benchmark for Human-centric Video Analysis in Complex Events

Along with the development of the modern smart city, human-centric video...

5 Weiyao Lin, et al. ∙

research

∙ 04/07/2020

End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection

Reliable and accurate 3D object detection is a necessity for safe autono...

6 Rui Qian, et al. ∙

research

∙ 11/06/2018

Weakly Supervised Scene Parsing with Point-based Distance Metric Learning

Semantic scene parsing is suffering from the fact that pixel-level annot...

4 Rui Qian, et al. ∙

research

∙ 11/28/2017

Attentive Generative Adversarial Network for Raindrop Removal from a Single Image

Raindrops adhered to a glass window or camera lens can severely hamper t...

0 Rui Qian, et al. ∙
