Yingwei Pan

research

∙ 03/23/2023

Transforming Radiance Field with Lipschitz Network for Photorealistic 3D Scene Stylization

Recent advances in 3D scene representation and novel view synthesis have...

0 Zicheng Zhang, et al. ∙

research

∙ 03/13/2023

Modality-Agnostic Debiasing for Single Domain Generalization

Deep neural networks (DNNs) usually fail to generalize well to outside o...

0 Sanqing Qu, et al. ∙

research

∙ 12/06/2022

Semantic-Conditional Diffusion Networks for Image Captioning

Recent advances on text-to-image generation have witnessed the rise of d...

0 Jianjie Luo, et al. ∙

research

∙ 11/15/2022

Dynamic Temporal Filtering in Video Models

Video temporal dynamics is conventionally modeled with 3D spatial-tempor...

0 Fuchen Long, et al. ∙

research

∙ 11/15/2022

SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement

In this paper, we propose a novel deep architecture tailored for 3D poin...

0 Zhaofan Qiu, et al. ∙

research

∙ 11/15/2022

3D Cascade RCNN: High Quality Object Detection in Point Clouds

Recent progress on 2D object detection has featured Cascade RCNN, which ...

0 Qi Cai, et al. ∙

research

∙ 09/26/2022

Out-of-Distribution Detection with Hilbert-Schmidt Independence Optimization

Outlier detection tasks have been playing a critical role in AI safety. ...

16 Jingyang Lin, et al. ∙

research

∙ 07/11/2022

Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning

Multi-scale Vision Transformer (ViT) has emerged as a powerful backbone ...

0 Ting Yao, et al. ∙

research

∙ 07/11/2022

Dual Vision Transformer

Prior works have proposed several strategies to reduce the computational...

0 Ting Yao, et al. ∙

research

∙ 06/14/2022

Stand-Alone Inter-Frame Attention in Video Models

Motion, as the uniqueness of a video, has been critical to the developme...

0 Fuchen Long, et al. ∙

research

∙ 06/14/2022

Comprehending and Ordering Semantics for Image Captioning

Comprehending the rich semantics in an image and ordering them in lingui...

0 Yehao Li, et al. ∙

research

∙ 06/13/2022

Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection

Recent high-performing Human-Object Interaction (HOI) detection techniqu...

0 Yong Zhang, et al. ∙

research

∙ 06/13/2022

Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation

This paper presents an overview and comparative analysis of our systems ...

3 Yingwei Pan, et al. ∙

research

∙ 01/11/2022

Representing Videos as Discriminative Sub-graphs for Action Recognition

Human actions are typically of combinatorial structures or patterns, i.e...

7 Dong Li, et al. ∙

research

∙ 01/11/2022

Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training

Vision-language pre-training has been an emerging and fast-developing re...

0 Yehao Li, et al. ∙

research

∙ 01/11/2022

Smart Director: An Event-Driven Directing System for Live Broadcasting

Live video broadcasting normally requires a multitude of skills and expe...

6 Yingwei Pan, et al. ∙

research

∙ 12/15/2021

Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration

Our work reveals a structured shortcoming of the existing mainstream sel...

0 Yu Wang, et al. ∙

research

∙ 12/14/2021

A Style and Semantic Memory Mechanism for Domain Generalization

Mainstream state-of-the-art domain generalization algorithms tend to pri...

0 Yang Chen, et al. ∙

research

∙ 12/14/2021

Transferrable Contrastive Learning for Visual Domain Adaptation

Self-supervised learning (SSL) has recently become the favorite among fe...

0 Yang Chen, et al. ∙

research

∙ 12/14/2021

CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising

BERT-type structure has led to the revolution of vision-language pre-tra...

0 Jianjie Luo, et al. ∙

research

∙ 12/14/2021

CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

Localizing text instances in natural scenes is regarded as a fundamental...

0 Jingyang Lin, et al. ∙

research

∙ 08/18/2021

X-modaler: A Versatile and High-performance Codebase for Cross-modal Analytics

With the rise and development of deep learning over the past decade, the...

0 Yehao Li, et al. ∙

research

∙ 08/05/2021

A Low Rank Promoting Prior for Unsupervised Contrastive Learning

Unsupervised learning is just at a tipping point where it could really t...

2 Yu Wang, et al. ∙

research

∙ 07/26/2021

Contextual Transformer Networks for Visual Recognition

Transformer with self-attention has led to the revolutionizing of natura...

0 Yehao Li, et al. ∙

research

∙ 01/27/2021

Scheduled Sampling in Vision-Language Pretraining with Decoupled Encoder-Decoder Network

Despite having impressive vision-language (VL) pretraining with BERT-bas...

0 Yehao Li, et al. ∙

research

∙ 09/30/2020

Joint Contrastive Learning with Infinite Possibilities

This paper explores useful modifications of the recent development in co...

7 Qi Cai, et al. ∙

research

∙ 08/03/2020

SeCo: Exploring Sequence Supervision for Unsupervised Representation Learning

A steady momentum of innovations and breakthroughs has convincingly push...

0 Ting Yao, et al. ∙

research

∙ 07/07/2020

Single Shot Video Object Detector

Single shot detectors that are potentially faster and simpler than two-s...

0 Jiajun Deng, et al. ∙

research

∙ 07/05/2020

Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training

In this work, we present Auto-captions on GIF, which is a new large-scal...

11 Yingwei Pan, et al. ∙

research

∙ 06/11/2020

Learning a Unified Sample Weighting Network for Object Detection

Region sampling or weighting is significantly important to the success o...

10 Qi Cai, et al. ∙

research

∙ 06/11/2020

Exploring Category-Agnostic Clusters for Open-Set Domain Adaptation

Unsupervised domain adaptation has received significant attention in rec...

0 Yingwei Pan, et al. ∙

research

∙ 03/31/2020

X-Linear Attention Networks for Image Captioning

Recent progress on fine-grained visual recognition and visual question a...

0 Yingwei Pan, et al. ∙

research

∙ 10/08/2019

Multi-Source Domain Adaptation and Semi-Supervised Domain Adaptation with Focus on Visual Domain Adaptation Challenge 2019

This notebook paper presents an overview and comparative analysis of our...

0 Yingwei Pan, et al. ∙

research

∙ 09/09/2019

Hierarchy Parsing for Image Captioning

It is always well believed that parsing an image into constituent visual...

0 Ting Yao, et al. ∙

research

∙ 09/09/2019

Deep Metric Learning with Density Adaptivity

The problem of distance metric learning is mostly considered from the pe...

13 Yehao Li, et al. ∙

research

∙ 08/26/2019

Mocycle-GAN: Unpaired Video-to-Video Translation

Unsupervised image-to-image translation is the task of translating an im...

19 Yang Chen, et al. ∙

research

∙ 08/26/2019

Relation Distillation Networks for Video Object Detection

It has been well recognized that modeling object-to-object relations wou...

0 Jiajun Deng, et al. ∙

research

∙ 08/16/2019

daBNN: A Super Fast Inference Framework for Binary Neural Networks on ARM devices

It is always well believed that Binary Neural Networks (BNNs) could dras...

0 Jianhao Zhang, et al. ∙

research

∙ 08/01/2019

Convolutional Auto-encoding of Sentence Topics for Image Paragraph Generation

Image paragraph generation is the task of producing a coherent story (us...

1 Jing Wang, et al. ∙

research

∙ 06/20/2019

vireoJD-MM at Activity Detection in Extended Videos

This notebook paper presents an overview and comparative analysis of our...

0 Fuchen Long, et al. ∙

research

∙ 06/14/2019

Trimmed Action Recognition, Dense-Captioning Events in Videos, and Spatio-temporal Action Localization with Focus on ActivityNet Challenge 2019

This notebook paper presents an overview and comparative analysis of our...

0 Zhaofan Qiu, et al. ∙

research

∙ 05/03/2019

Temporal Deformable Convolutional Encoder-Decoder Networks for Video Captioning

It is well believed that video captioning is a fundamental but challengi...

0 Jingwen Chen, et al. ∙

research

∙ 04/25/2019

Pointing Novel Objects in Image Captioning

Image captioning has received significant attention with remarkable impr...

0 Yehao Li, et al. ∙

research

∙ 04/25/2019

Exploring Object Relation in Mean Teacher for Cross-Domain Detection

Rendering synthetic data (e.g., 3D CAD-rendered images) to generate anno...

0 Qi Cai, et al. ∙

research

∙ 04/25/2019

Transferrable Prototypical Networks for Unsupervised Domain Adaptation

In this paper, we introduce a new idea for unsupervised domain adaptatio...

0 Yingwei Pan, et al. ∙

research

∙ 09/19/2018

Exploring Visual Relationship for Image Captioning

It is always well believed that modeling relationships between objects w...

0 Ting Yao, et al. ∙

research

∙ 04/23/2018

Memory Matching Networks for One-Shot Image Recognition

In this paper, we introduce the new ideas of augmenting Convolutional Ne...

0 Qi Cai, et al. ∙

research

∙ 04/23/2018

Deep Semantic Hashing with Generative Adversarial Networks

Hashing has been a widely-adopted technique for nearest neighbor search ...

0 Zhaofan Qiu, et al. ∙

research

∙ 04/23/2018

Jointly Localizing and Describing Events for Dense Video Captioning

Automatically describing a video with natural language is regarded as a ...

0 Yehao Li, et al. ∙

research

∙ 04/23/2018

To Create What You Tell: Generating Videos from Captions

We are creating multimedia contents everyday and everywhere. While autom...

0 Yingwei Pan, et al. ∙
