LORA: Learning to Optimize for Resource Allocation in Wireless Networks with Few Training Samples
Effective resource allocation plays a pivotal role in performance optimization for wireless networks. Unfortunately, typical resource allocation problems are mixed-integer nonlinear programming (MINLP) problems, which are NP-hard in general. Machine learning-based methods have recently emerged as a disruptive way to obtain near-optimal performance for MINLP problems with affordable computational complexity. However, a key challenge is that these methods require huge numbers of training samples, which are difficult to obtain in practice. Furthermore, they suffer from severe performance deterioration when the network parameters change, which happens frequently and can be characterized as the task mismatch issue. In this paper, to address the sample complexity issue, instead of directly learning the input-output mapping of a particular resource allocation algorithm, we propose a Learning to Optimize framework for Resource Allocation, called LORA, that learns the pruning policy in the optimal branch-and-bound algorithm. By exploiting the algorithm structure, this framework enjoys an extremely low sample complexity, on the order of tens or hundreds of samples, compared with millions for existing methods. To further address the task mismatch issue, we propose a transfer learning method via self-imitation, named LORA-TL, which can adapt to a new task with only a few additional unlabeled training samples. Numerical simulations demonstrate that LORA outperforms specialized state-of-the-art algorithms and achieves near-optimal performance. Moreover, LORA-TL, relying on only a few unlabeled samples, achieves performance comparable to that of a model trained from scratch with sufficient labeled samples.
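To make the idea of a pluggable pruning policy inside branch-and-bound concrete, the following is a minimal sketch under stated assumptions and is not the LORA implementation. The toy problem (a binary selection under a budget), the function names, and the `margin_prune` rule are all hypothetical; in particular, `margin_prune` is only a hand-written placeholder for where a trained pruning classifier could be plugged in.

```python
from typing import Callable, List, Tuple

# Toy stand-in (an assumption, not from the paper): pick a binary subset of
# (value, cost) items that maximizes total value under a budget. Costs must be > 0.
Item = Tuple[float, float]
PrunePolicy = Callable[[float, float, int], bool]


def relaxation_bound(items: List[Item], value: float, remaining: float, start: int) -> float:
    """Upper bound from relaxing the undecided binary variables to [0, 1]."""
    bound = value
    for v, c in items[start:]:
        if c <= remaining:
            bound += v
            remaining -= c
        else:
            bound += v * remaining / c  # take a fraction of the first item that does not fit
            break
    return bound


def exact_prune(bound: float, incumbent: float, depth: int) -> bool:
    """Classical rule: prune only when the node cannot beat the incumbent."""
    return bound <= incumbent


def margin_prune(bound: float, incumbent: float, depth: int) -> bool:
    """Hypothetical placeholder for a learned policy: prunes more aggressively,
    so it may discard the optimal solution."""
    return bound <= 1.1 * incumbent


def branch_and_bound(items: List[Item], budget: float,
                     prune: PrunePolicy = exact_prune) -> Tuple[float, int]:
    """Depth-first branch-and-bound; returns (best value found, nodes explored)."""
    # Sorting by value density makes the greedy relaxation bound valid.
    items = sorted(items, key=lambda it: it[0] / it[1], reverse=True)
    best, explored = 0.0, 0
    stack = [(0, 0.0, budget)]  # (depth, value so far, remaining budget)
    while stack:
        depth, value, remaining = stack.pop()
        explored += 1
        best = max(best, value)  # setting all undecided variables to 0 is feasible
        if depth == len(items):
            continue
        if prune(relaxation_bound(items, value, remaining, depth), best, depth):
            continue
        v, c = items[depth]
        stack.append((depth + 1, value, remaining))  # branch: exclude the item
        if c <= remaining:
            stack.append((depth + 1, value + v, remaining - c))  # branch: include it
    return best, explored


if __name__ == "__main__":
    toy_items = [(60.0, 10.0), (100.0, 20.0), (120.0, 30.0)]
    print(branch_and_bound(toy_items, 50.0, exact_prune))
    print(branch_and_bound(toy_items, 50.0, margin_prune))
```

With `exact_prune` the search returns the optimum of the toy problem; any more aggressive policy trades that guarantee for fewer explored nodes, which is the trade-off a learned pruning policy is meant to manage.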