Harsh Agrawal

research

∙ 05/22/2022

Housekeep: Tidying Virtual Households using Commonsense Reasoning

We introduce Housekeep, a benchmark to evaluate commonsense reasoning in...

Yash Kant, et al.

research

∙ 04/06/2022

Simple and Effective Synthesis of Indoor 3D Scenes

We study the problem of synthesizing immersive 3D indoor scenes from one...

Jing Yu Koh, et al.

research

∙ 10/27/2021

SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation

Natural language instructions for visual navigation often use scene desc...

Abhinav Moudgil, et al.

research

∙ 08/26/2021

The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation

It is fundamental for personal robots to reliably navigate to a specifie...

Xiaoming Zhao, et al.

research

∙ 10/13/2020

Contrast and Classify: Alternate Training for Robust VQA

Recent Visual Question Answering (VQA) models have shown impressive perf...

Yash Kant, et al.

research

∙ 07/23/2020

Spatially Aware Multimodal Transformers for TextVQA

Textual cues are essential for everyday tasks like buying groceries and ...

Yash Kant, et al.

research

∙ 08/22/2019

Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning

Diverse and accurate vision+language modeling is an important goal to re...

Jyoti Aneja, et al.

research

∙ 02/10/2019

EvalAI: Towards Better Evaluation Systems for AI Agents

We introduce EvalAI, an open source platform for evaluating and comparin...

Deshraj Yadav, et al.

research

∙ 12/20/2018

nocaps: novel object captioning at scale

Image captioning models have achieved impressive results on datasets con...

Harsh Agrawal, et al.

research

∙ 10/27/2018

Fabrik: An Online Collaborative Neural Network Editor

We present Fabrik, an online neural network editor that provides tools t...

Utsav Garg, et al.

research

∙ 06/23/2016

Sort Story: Sorting Jumbled Images and Captions into Stories

Temporal common sense has applications in AI tasks such as QA, multi-doc...

Harsh Agrawal, et al.

research

∙ 06/17/2016

Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?

We conduct large-scale studies on 'human attention' in Visual Question A...

Abhishek Das, et al.

research

∙ 06/12/2015

CloudCV: Large Scale Distributed Computer Vision as a Cloud Service

We are witnessing a proliferation of massive visual data. Unfortunately ...

Harsh Agrawal, et al.

