Image classification has improved with the development of training
techn...
Neural Architecture Search (NAS) aims to automatically excavate the opti...
Supervised learning of image classifiers distills human knowledge into a...
In this paper, we introduce a novel learning scheme named weakly
semi-su...
In this paper, we aim to design a quantitative similarity function betwe...
The favorable performance of Vision Transformers (ViTs) is often attribu...
Understanding temporal dynamics of video is an essential aspect of learn...
Active domain adaptation (ADA) studies have mainly addressed query selec...
Recent self-supervised video representation learning methods focus on
ma...
In Neural Architecture Search (NAS), reducing the cost of architecture
e...
Trainable layers such as convolutional building blocks are the standard
...
Understanding document images (e.g., invoices) has been an important res...
Utilizing vicinal space between the source and target domains is one of ...
Vision Transformer (ViT) extends the application range of transformers f...
ImageNet has been arguably the most popular image classification benchma...
State-of-the-art video action classifiers often suffer from overfitting....
This paper addresses representational bottleneck in a network and propos...
Normalization techniques, such as batch normalization (BN), have led to
...
Despite apparent human-level performances of deep neural networks (DNN),...
In this paper, we propose a new multi-scale face detector having an extr...
Regional dropout strategies have been proposed to enhance the performanc...
Scene text detection methods based on neural networks have emerged recen...
Many new proposals for scene text recognition (STR) models have been
int...
The semantic segmentation requires a lot of computational cost. The dila...
Deep convolutional neural networks (DCNNs) have shown remarkable perform...