John Langford
Principal Researcher
Miscalibration of gaze tracking devices and the resulting need for repea...
Active learning is perhaps most naturally posed as an online learning pr...
Modern decision-making systems, from robots to web recommendation engine...
Learning to control an agent from data collected offline in a rich pixel...
This work introduces the Eigen Memory Tree (EMT), a novel online memory ...
A person walking along a city street who tries to model all aspects of t...
A central problem in sequential decision making is to develop algorithms...
Consider the problem setting of Interaction-Grounded Learning (IGL), in ...
In real-world reinforcement learning applications the learner's observat...
Large-scale machine learning systems often involve data distributed acro...
Many real-world applications of reinforcement learning (RL) require the ...
Consider a prosthetic arm, learning to adapt to its user's control signa...
We propose the ChaCha (Champion-Challengers) algorithm for making an onl...
Large software systems tune hundreds of 'constants' to optimize their ru...
We introduce a new problem setting for continuous control called the LQR...
We create a computationally tractable algorithm for contextual bandits w...
The global health threat from COVID-19 has been controlled in a number o...
We study a new form of federated learning where the clients train person...
We present an algorithm, HOMER, for exploration and reinforcement learni...
We design a new algorithm for batch active learning with deep neural net...
We apply empirical likelihood techniques to contextual bandit policy val...
We propose a neural architecture search (NAS) algorithm, Petridish, to i...
We study contextual bandit learning with an abstract policy class and co...
We study the exploration problem in episodic MDPs with rich observations...
We investigate the feasibility of learning from both fully-labeled super...
We study the sample complexity of model-based reinforcement learning in ...
We design and study a Contextual Memory Tree (CMT), a learning memory co...
We present a systematic approach for achieving fairness in a binary clas...
We study the computational tractability of provably sample-efficient (PA...
We study and empirically optimize contextual bandit learning, exploratio...
Most contextual bandit algorithms minimize regret to the best fixed poli...
We propose to directly map raw visual observations and text input to act...
We design an active learning algorithm for cost-sensitive multiclass cla...
This paper studies systematic exploration for reinforcement learning wit...
We create a new online reduction of multiclass classification to binary ...
This paper studies the evaluation of policies that recommend an ordered ...
We investigate active learning with access to two distinct oracles: Labe...
We propose and study a new model for reinforcement learning with rich ob...
We develop a new active learning algorithm for the streaming setting sat...
We demonstrate that a dependency parser can be built using a credit assi...
We study sequential decision making in environments where rewards are on...
Methods for learning to search for structured prediction typically imita...
Can we effectively learn a nonlinear representation in time comparable t...
We introduce online learning algorithms which are independent of feature...
We consider the problem of estimating the conditional probability of a l...
We present a new algorithm for the contextual bandit learning problem, w...
We present and prove properties of a new offline policy evaluator for an...
This is an index to the papers that appear in the Proceedings of the 29t...
We show how to reduce the process of predicting general order statistics...
In evaluating prediction markets (and other crowd-prediction mechanisms)...