Multi-player Multi-Armed Bandits with non-zero rewards on collisions for uncoordinated spectrum access

10/21/2019
by   Akshayaa Magesh, et al.

In this paper, we study the uncoordinated spectrum access problem using the multi-player multi-armed bandits framework. We consider a model in which there is no central control and the users cannot communicate with each other. The environment may appear different to different users, i.e., the mean reward of a given channel may differ from one user to another. Additionally, when users collide on a channel, the colliding users are allowed to receive non-zero rewards. For this setup, we present a policy that achieves expected regret of order O(log^{2+δ} T) for some δ > 0.
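
To make the setting concrete, here is a minimal sketch of one round of such an environment. It is only an illustration under stated assumptions, not the model or policy from the paper: the Bernoulli rewards, the function name play_round, and the collision_scale parameter (a factor by which a colliding user's mean reward is reduced) are all hypothetical choices made for this example.

```python
import random


def play_round(means, choices, collision_scale=0.5, rng=random):
    """Simulate one round and return one 0/1 reward per user.

    means[k][m]     : mean reward of channel m as seen by user k
                      (user-dependent, as in the abstract).
    choices[k]      : channel selected by user k in this round.
    collision_scale : hypothetical factor applied to a user's mean when it
                      shares a channel; an assumption, not from the paper.
    """
    # Count how many users selected each channel.
    load = {}
    for m in choices:
        load[m] = load.get(m, 0) + 1

    rewards = []
    for k, m in enumerate(choices):
        mean = means[k][m]
        if load[m] > 1:
            # Collision: the reward is reduced but not forced to zero.
            mean *= collision_scale
        # Bernoulli draw (assumed reward distribution).
        rewards.append(1 if rng.random() < mean else 0)
    return rewards
```

For example, play_round([[0.9, 0.4], [0.3, 0.8]], [0, 0]) draws rewards for two users colliding on channel 0, each with its own (scaled) mean. Setting collision_scale to 0 recovers the classical zero-reward-on-collision model.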
