Near Optimal Adversarial Attack on UCB Bandits

08/21/2020
by   Shiliang Zuo, et al.
0

We consider a stochastic multi-arm bandit problem where rewards are subject to adversarial corruption. We propose a novel attack strategy that manipulates a UCB principle into pulling some non-optimal target arm T - o(T) times with a cumulative cost that scales as √(log T), where T is the number of rounds. We also prove the first lower bound on the cumulative attack cost. Our lower bound matches our upper bound up to loglog T factors, showing our attack to be near optimal.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/30/2023

Adversarial Attacks on Adversarial Bandits

We study a security threat to adversarial multi-armed bandits, in which ...
research
04/21/2018

Gradient Masking Causes CLEVER to Overestimate Adversarial Perturbation Size

A key problem in research on adversarial examples is that vulnerability ...
research
08/02/2023

Certified Multi-Fidelity Zeroth-Order Optimization

We consider the problem of multi-fidelity zeroth-order optimization, whe...
research
07/07/2020

Stochastic Linear Bandits Robust to Adversarial Attacks

We consider a stochastic linear bandit problem in which the rewards are ...
research
12/31/2019

Near-Optimal Schedules for Simultaneous Multicasts

We study the store-and-forward packet routing problem for simultaneous m...
research
01/12/2022

Best Arm Identification with a Fixed Budget under a Small Gap

We consider the fixed-budget best arm identification problem in the mult...
research
11/13/2022

Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards

We investigate the Multi-Armed Bandit problem with Temporally-Partitione...

Please sign up or login with your details

Forgot password? Click here to reset