For robots to be useful outside labs and specialized factories we need a...
We present a novel model for Tracking Any Point (TAP) that effectively t...
We propose a novel multimodal video benchmark - the Perception Test - to...
Generic motion understanding from video involves not only tracking objec...
Much of the recent progress in 3D vision has been driven by the developm...
The recently-proposed Perceiver model obtains good results on several do...
Cryo-electron microscopy (cryo-EM) has revolutionized experimental prote...
Given new tasks with very little data–such as new classes in a classific...
We introduce Bootstrap Your Own Latent (BYOL), a new approach to self-su...
Simulation is an anonymous, low-bias source of data where annotation can...
Large scale deep learning excels when labeled images are abundant, yet d...
We present a bundle-adjustment-based algorithm for recovering accurate 3...
Physical construction -- the ability to compose objects, subject to phys...
We introduce the Action Transformer model for recognizing and localizing...
Visual QA is a pivotal challenge for higher-level reasoning, requiring u...
Attention mechanisms in biological perception are thought to select subs...
We introduce a simple baseline for action localization on the AVA datase...
We present a method for using previously-trained 'teacher' agents to kic...
We investigate methods for combining multiple self-supervised tasks--i.e...
In a given scene, humans can often easily predict a set of immediate fut...
In just three years, Variational Autoencoders (VAEs) have emerged as one...
Building on the success of recent discriminative mid-level elements, we ...