Machine learning models perform well on several healthcare tasks and can...
Egocentric gaze anticipation serves as a key building block for the emer...
We present ShapeClipper, a novel method that reconstructs 3D object shap...
In a noisy conversation environment such as a dinner party, people often...
Persuasion modeling is a key building block for conversational agents. E...
The promise of Mobile Health (mHealth) is the ability to use wearable se...
A hallmark of the deep learning era for computer vision is the successfu...
We address the challenging task of Localization via Embodied Dialog (LED...
In this paper, we present the first transformer-based model to address t...
We present a novel 3D shape reconstruction method which learns to predic...
We introduce the novel problem of anticipating a time series of future h...
Attention mechanisms take an expectation of a data representation with r...
Ecological Momentary Assessments (EMAs) are an important psychological d...
The Continuous-Time Hidden Markov Model (CT-HMM) is an attractive approa...
Given a video captured from a first person perspective and recorded in a...
In multi-object tracking, the tracker maintains in its memory the appear...
It is widely accepted that reasoning about object shape is important for...
Continual learning is known for suffering from catastrophic forgetting, ...
To understand human daily social interaction from egocentric perspective...
We present Where Are You? (WAY), a dataset of 6k dialogs in which two h...
The key challenge in single image 3D shape reconstruction is to ensure t...
We address the task of jointly determining what a person is doing and wh...
In this work, we present a method for obtaining an implicit objective fu...
The inductive bias of a neural network is largely determined by the arch...
We address the problem of detecting attention targets in video. Specific...
Panel count data is recurrent events data where counts of events are obs...
Inner product-based convolution has been the founding stone of convoluti...
Recent work on minimum hyperspherical energy (MHE) has demonstrated its...
We consider the problem of online adaptation of a neural network designe...
Localizing moments in untrimmed videos via language queries is a new and...
We present a task-aware approach to synthetic data generation. Our frame...
We present an unsupervised learning approach to recover 3D human pose fr...
We address the challenging problem of learning motion representations us...
Eye contact is a crucial element of non-verbal communication that signif...
We describe a novel cross-modal embedding space for actions, named Actio...
In this paper, we provide a modern synthesis of the classic inverse comp...
In this paper we present a framework for combining deep learning-based r...
Automatic generation of textual video descriptions that are time-aligned...
This article presents AutoRally, a 1:5 scale robotics testbed for autono...
Inner product-based convolution has been a central component of convolut...
Estimation of 3D motion in a dynamic scene from a temporal pair of image...
In this paper, we make an important step towards the black-box machine t...
Estimating the head pose of a person is a crucial problem that has a lar...
Face detection is a very important task and a necessary pre-processing s...
Gaze tracking is an important technology as the system can give informat...
In this paper, we consider the problem of machine teaching, the inverse...
Approximate Bayesian Computation (ABC) is a framework for performing lik...
Data-driven approaches for edge detection have proven effective and achi...
In this paper we provide an extensive evaluation of fixation prediction ...