A Multi-Armed Bandit-based Approach to Mobile Network Provider Selection
We argue for giving users the ability to lease bandwidth temporarily from any mobile network operator. We propose, prototype, and evaluate a spectrum market for mobile network access in which multiple network operators offer blocks of bandwidth at specified prices for short-term lease, and autonomous agents on user devices make purchase decisions by trading off price, performance, and budget constraints. We show that provider selection can be formulated as a bandit problem. For the case where providers change prices synchronously, we approach the problem with contextual multi-armed bandits and reinforcement learning methods such as Q-learning, applied either directly to the bandit maximization problem or indirectly to approximate the Gittins indices, which are known to yield the optimal provider selection policy. In a simulated scenario corresponding to a practical use case, our agent shows a 20-41% improvement in quality of experience (QoE) over random provider selection under various demand, price, and mobility conditions. We implemented a prototype spectrum market using LTE networks and eSIM technology and deployed it on a testbed, using a blockchain to implement the ledger where bandwidth purchase transactions are recorded. Experiments showed that we can learn both user behavior and network performance efficiently, and recorded 25-74% improvements in QoE under various competing-agent scenarios.
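To make the bandit formulation concrete, the following is a minimal sketch of provider selection as a contextual multi-armed bandit. It is not the agent evaluated in this work: it uses a plain epsilon-greedy rule with sample-average value estimates rather than Q-learning or Gittins indices, and the provider names, price contexts, and mean QoE values are hypothetical, chosen only for illustration.

```python
import random
from collections import defaultdict


class ProviderSelector:
    """Epsilon-greedy contextual bandit with one value estimate per (context, provider)."""

    def __init__(self, providers, epsilon=0.1, seed=0):
        self.providers = list(providers)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(float)     # estimated mean reward per (context, provider)
        self.counts = defaultdict(int)  # number of selections per (context, provider)

    def select(self, context):
        # Explore a random provider with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.providers)
        return max(self.providers, key=lambda p: self.q[(context, p)])

    def update(self, context, provider, reward):
        # Incremental sample-average update of the value estimate.
        key = (context, provider)
        self.counts[key] += 1
        self.q[key] += (reward - self.q[key]) / self.counts[key]

    def best(self, context):
        return max(self.providers, key=lambda p: self.q[(context, p)])


def simulate(steps=20000, seed=1):
    """Run the agent against a toy environment; returns (agent, average reward)."""
    env = random.Random(seed)
    # Hypothetical probability of a satisfactory session (reward 1) for each
    # provider under each price context. These numbers are assumptions.
    mean_qoe = {
        "low_price": {"A": 0.8, "B": 0.5, "C": 0.3},
        "high_price": {"A": 0.2, "B": 0.6, "C": 0.4},
    }
    contexts = list(mean_qoe)
    agent = ProviderSelector(["A", "B", "C"], epsilon=0.1, seed=seed)
    total = 0.0
    for _ in range(steps):
        context = env.choice(contexts)  # price context observed before choosing
        provider = agent.select(context)
        reward = 1.0 if env.random() < mean_qoe[context][provider] else 0.0
        agent.update(context, provider, reward)
        total += reward
    return agent, total / steps


if __name__ == "__main__":
    agent, avg = simulate()
    print("average reward:", round(avg, 3))
    for ctx in ("low_price", "high_price"):
        print(ctx, "->", agent.best(ctx))
```

In this toy setting, uniform random selection would earn the mean of all six entries, about 0.47 per step, so the gap between that figure and the agent's average reward is the kind of improvement over random selection that the abstract reports, though the numbers here carry no empirical meaning.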