Yuta Nakashima

research

∙ 04/20/2023

Not Only Generative Art: Stable Diffusion for Content-Style Disentanglement in Art Analysis

The duality of content and style is inherent to the nature of art. For h...

3 Yankun Wu, et al. ∙

research

∙ 04/20/2023

Learning Bottleneck Concepts in Image Classification

Interpreting and explaining the behavior of deep neural networks is crit...

0 Bowen Wang, et al. ∙

research

∙ 04/07/2023

Model-Agnostic Gender Debiased Image Captioning

Image captioning models are known to perpetuate and amplify harmful soci...

8 Yusuke Hirota, et al. ∙

research

∙ 04/06/2023

Uncurated Image-Text Datasets: Shedding Light on Demographic Bias

The increasing tendency to collect large and uncurated datasets to train...

2 Noa Garcia, et al. ∙

research

∙ 04/04/2023

Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation

Human evaluation is critical for validating the performance of text-to-i...

0 Mayu Otani, et al. ∙

research

∙ 01/31/2023

Inference Time Evidences of Adversarial Attacks for Forensic on Transformers

Vision Transformers (ViTs) are becoming a very popular paradigm for visi...

0 Hugo Lemarchant, et al. ∙

research

∙ 11/18/2022

Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization

Video summarization aims to select the most informative subset of frames...

0 Zongshang Pang, et al. ∙

research

∙ 10/13/2022

Deep Gesture Generation for Social Robots Using Type-Specific Libraries

Body language such as conversational gesture is a powerful way to ease c...

0 Hitoshi Teshima, et al. ∙

research

∙ 08/23/2022

Learning More May Not Be Better: Knowledge Transferability in Vision and Language Tasks

Is more data always better to train vision-and-language models? We study...

6 Tianwei Chen, et al. ∙

research

∙ 05/17/2022

Gender and Racial Bias in Visual Question Answering Datasets

Vision-and-language tasks have increasingly drawn more attention as a me...

15 Yusuke Hirota, et al. ∙

research

∙ 03/30/2022

AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval

Evaluation measures have a crucial impact on the direction of research. ...

0 Riku Togashi, et al. ∙

research

∙ 03/29/2022

Quantifying Societal Bias Amplification in Image Captioning

We study societal bias amplification in image captioning. Image captioni...

1 Yusuke Hirota, et al. ∙

research

∙ 03/28/2022

Optimal Correction Cost for Object Detection Evaluation

Mean Average Precision (mAP) is the primary evaluation measure for objec...

0 Mayu Otani, et al. ∙

research

∙ 10/26/2021

Transferring Domain-Agnostic Knowledge in Video Question Answering

Video question answering (VideoQA) is designed to answer a given questio...

0 Tianran Wu, et al. ∙

research

∙ 09/13/2021

Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

Have you ever looked at a painting and wondered what is the story behind...

0 Zechen Bai, et al. ∙

research

∙ 09/02/2021

Built Year Prediction from Buddha Face with Heterogeneous Labels

Buddha statues are a part of human culture, especially of the Asia area,...

1 Yiming Qian, et al. ∙

research

∙ 07/07/2021

PoseRN: A 2D pose refinement network for bias-free multi-view 3D human pose estimation

We propose a new 2D pose refinement network that learns to predict the h...

0 Akihiko Sayo, et al. ∙

research

∙ 06/25/2021

A Picture May Be Worth a Hundred Words for Visual Question Answering

How far can we go with textual representations for understanding picture...

1 Yusuke Hirota, et al. ∙

research

∙ 05/25/2021

GCNBoost: Artwork Classification by Label Propagation through a Knowledge Graph

The rise of digitization of cultural documents offers large-scale conten...

0 Cheikh Brahim El Vaigh, et al. ∙

research

∙ 01/28/2021

Development of a Vertex Finding Algorithm using Recurrent Neural Network

Deep learning is a rapidly-evolving technology with possibility to signi...

0 Kiichi Goto, et al. ∙

research

∙ 01/14/2021

Understanding the Role of Scene Graphs in Visual Question Answering

Visual Question Answering (VQA) is of tremendous interest to the researc...

0 Vinay Damodaran, et al. ∙

research

∙ 11/25/2020

Match Them Up: Visually Explainable Few-shot Image Classification

Few-shot learning (FSL) approaches are usually based on an assumption th...

9 Bowen Wang, et al. ∙

research

∙ 11/07/2020

Grading the Severity of Arteriolosclerosis from Retinal Arterio-venous Crossing Patterns

The status of retinal arteriovenous crossing is of great significance fo...

10 Liangzhi Li, et al. ∙

research

∙ 10/19/2020

Noisy-LSTM: Improving Temporal Awareness for Video Semantic Segmentation

Semantic video segmentation is a key challenge for various applications....

19 Bowen Wang, et al. ∙

research

∙ 10/11/2020

Constructing a Visual Relationship Authenticity Dataset

A visual relationship denotes a relationship between two objects in an i...

0 Chenhui Chu, et al. ∙

research

∙ 09/30/2020

Demographic Influences on Contemporary Art with Unsupervised Style Embeddings

Computational art analysis has, through its reliance on classification t...

1 Nikolai Huckle, et al. ∙

research

∙ 09/14/2020

SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition

Explainable artificial intelligence is gaining attention. However, most ...

48 Liangzhi Li, et al. ∙

research

∙ 09/01/2020

Uncovering Hidden Challenges in Query-Based Video Moment Retrieval

The query-based moment retrieval is a problem of localising a specific c...

4 Mayu Otani, et al. ∙

research

∙ 08/28/2020

A Dataset and Baselines for Visual Question Answering on Art

Answering questions related to art pieces (paintings) is a difficult tas...

20 Noa Garcia, et al. ∙

research

∙ 07/22/2020

Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human Action Recognition

Conventional 3D convolutional neural networks (CNNs) are computationally...

5 Sudhakar Kumawat, et al. ∙

research

∙ 07/17/2020

Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions

To understand movies, humans constantly reason over the dialogues and ac...

14 Noa Garcia, et al. ∙

research

∙ 05/27/2020

Joint Learning of Vessel Segmentation and Artery/Vein Classification with Post-processing

Retinal imaging serves as a valuable tool for diagnosis of various disea...

3 Liangzhi Li, et al. ∙

research

∙ 04/22/2020

Yoga-82: A New Dataset for Fine-grained Classification of Human Poses

Human pose estimation is a well-known problem in computer vision to loca...

5 Manisha Verma, et al. ∙

research

∙ 04/17/2020

Knowledge-Based Visual Question Answering in Videos

We propose a novel video understanding task by fusing knowledge-based an...

12 Noa Garcia, et al. ∙

research

∙ 12/12/2019

IterNet: Retinal Image Segmentation Utilizing Structural Redundancy in Vessel Networks

Retinal vessel segmentation is of great interest for diagnosis of retina...

38 Liangzhi Li, et al. ∙

research

∙ 10/23/2019

KnowIT VQA: Answering Knowledge-Based Questions about Videos

We propose a novel video understanding task by fusing knowledge-based an...

9 Noa Garcia, et al. ∙

research

∙ 09/17/2019

BUDA.ART: A Multimodal Content-Based Analysis and Retrieval System for Buddha Statues

We introduce BUDA.ART, a system designed to assist researchers in Art Hi...

0 Benjamin Renoust, et al. ∙

research

∙ 09/17/2019

Historical and Modern Features for Buddha Statue Classification

While Buddhism has spread along the Silk Roads, many pieces of art have ...

0 Benjamin Renoust, et al. ∙

research

∙ 04/24/2019

Understanding Art through Multi-Modal Retrieval in Paintings

In computer vision, visual arts are often studied from a purely aestheti...

0 Noa Garcia, et al. ∙

research

∙ 04/10/2019

Context-Aware Embeddings for Automatic Art Analysis

Automatic art analysis aims to classify and retrieve artistic representa...

0 Noa Garcia, et al. ∙

research

∙ 03/27/2019

Rethinking the Evaluation of Video Summaries

Video summarization is a technique to create a short skim of the origina...

0 Mayu Otani, et al. ∙

research

∙ 07/07/2018

Representing a Partially Observed Non-Rigid 3D Human Using Eigen-Texture and Eigen-Deformation

Reconstruction of the shape and motion of humans from RGB-D is a challen...

0 Ryosuke Kimura, et al. ∙

research

∙ 06/12/2018

iParaphrasing: Extracting Visually Grounded Paraphrases via an Image

A paraphrase is a restatement of the meaning of a text in other words. P...

0 Chenhui Chu, et al. ∙

research

∙ 09/25/2017

Summarization of User-Generated Sports Video by Using Deep Action Recognition Features

Automatically generating a summary of sports video poses the challenge o...

0 Antonio Tejero-de-Pablos, et al. ∙

research

∙ 09/28/2016

Video Summarization using Deep Semantic Features

This paper presents a video summarization technique for an Internet vide...

0 Mayu Otani, et al. ∙

research

∙ 08/08/2016

Learning Joint Representations of Videos and Sentences with Web Image Search

Our objective is video retrieval based on natural language queries. In a...

0 Mayu Otani, et al. ∙
