Marc Abeille

research

∙ 09/06/2023

Near-continuous time Reinforcement Learning for continuous state-action spaces

We consider the Reinforcement Learning problem of controlling an unknown...

Lorenzo Croissant, et al.

research

∙ 01/06/2022

Jointly Efficient and Optimal Algorithms for Logistic Bandits

Logistic Bandits have recently undergone careful scrutiny by virtue of t...

Louis Faury, et al.

research

∙ 03/09/2021

Regret Bounds for Generalized Linear Bandits under Parameter Drift

Generalized Linear Bandits (GLBs) are powerful extensions to the Linear ...

Louis Faury, et al.

research

∙ 10/23/2020

Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits

Logistic Bandits have recently attracted substantial attention, by provi...

Marc Abeille, et al.

research

∙ 10/20/2020

Real-Time Optimisation for Online Learning in Auctions

In display advertising, a small group of sellers and bidders face each o...

Lorenzo Croissant, et al.

research

∙ 07/13/2020

Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation

We study the exploration-exploitation dilemma in the linear quadratic re...

Marc Abeille, et al.

research

∙ 02/18/2020

Improved Optimistic Algorithms for Logistic Bandits

The generalized linear bandit framework has attracted a lot of attention...

Louis Faury, et al.

research

∙ 10/12/2019

Thompson Sampling in Non-Episodic Restless Bandits

Restless bandit problems assume time-varying reward distributions of the...

Young Hun Jung, et al.

research

∙ 08/21/2018

Thresholding the virtual value: a simple method to increase welfare and lower reserve prices in online auction systems

Second price auctions with reserve price are widely used by the main Int...

Thomas Nedelec, et al.

research

∙ 05/01/2018

Explicit shading strategies for repeated truthful auctions

With the increasing use of auctions in online advertising, there has bee...

Marc Abeille, et al.

research

∙ 03/27/2017

Thompson Sampling for Linear-Quadratic Control Problems

We consider the exploration-exploitation tradeoff in linear quadratic (L...

Marc Abeille, et al.

research

∙ 11/20/2016

Linear Thompson Sampling Revisited

We derive an alternative proof for the regret of Thompson sampling (TS) in...

Marc Abeille, et al.
