Trajectory Optimization of Flying Energy Sources using Q-Learning to Recharge Hotspot UAVs
Despite the increasing popularity of commercial UAVs and drone-delivered services, their dependence on limited-capacity on-board batteries restricts flight time and mission continuity. Developing in-situ power transfer solutions for topping up UAV batteries therefore has the potential to extend mission duration. In this paper, we study a scenario where UAVs are deployed as base stations (UAV-BSs) that provide wireless hotspot services to ground nodes while harvesting wireless energy from flying energy sources. These energy sources are specialized UAVs (charger or transmitter UAVs, tUAVs) equipped with wireless power transmitting devices such as RF antennae. tUAVs have the flexibility to adjust their flight path to maximize energy transfer. As the number of UAV-BSs and the environmental complexity grow, an intelligent trajectory selection procedure for tUAVs becomes necessary to optimize the energy transfer gain. We model the trajectory optimization of tUAVs as a Markov Decision Process (MDP) and solve it using the Q-Learning algorithm. Simulation results confirm that the Q-Learning based optimized trajectory of the tUAVs outperforms two benchmark strategies, namely random path planning and static hovering of the tUAVs.
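To make the MDP formulation concrete, the following is a minimal tabular Q-Learning sketch for a single tUAV. It is not the paper's simulation setup: the 5x5 grid, the UAV-BS positions, the inverse-square reward, the hyperparameters, and all function names are assumptions chosen only for illustration.

```python
import random

# All constants below are hypothetical; the abstract does not specify them.
GRID = 5                                               # tUAV moves on a 5x5 grid
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]   # four moves plus hover
UAV_BS = [(1, 1), (3, 4)]                              # assumed UAV-BS cells
ALT_GAP = 1.0                                          # assumed vertical separation


def harvested_energy(pos):
    """Toy reward: inverse-square path loss summed over all UAV-BSs."""
    x, y = pos
    return sum(
        1.0 / ((x - bx) ** 2 + (y - by) ** 2 + ALT_GAP ** 2)
        for bx, by in UAV_BS
    )


def step(pos, action):
    """Apply a move, clipping to the grid; return (next_state, reward)."""
    nx = min(max(pos[0] + action[0], 0), GRID - 1)
    ny = min(max(pos[1] + action[1], 0), GRID - 1)
    nxt = (nx, ny)
    return nxt, harvested_energy(nxt)


def train(episodes=2000, horizon=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n_actions = len(ACTIONS)
    q = {
        ((x, y), a): 0.0
        for x in range(GRID) for y in range(GRID) for a in range(n_actions)
    }
    for _ in range(episodes):
        state = (rng.randrange(GRID), rng.randrange(GRID))
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)                        # explore
            else:
                a = max(range(n_actions), key=lambda i: q[(state, i)])  # exploit
            nxt, reward = step(state, ACTIONS[a])
            best_next = max(q[(nxt, i)] for i in range(n_actions))
            # Standard Q-Learning update toward the bootstrapped target.
            q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
            state = nxt
    return q


def greedy_rollout(q, start, horizon=20):
    """Follow the greedy policy from `start`; return the total toy reward."""
    state, total = start, 0.0
    for _ in range(horizon):
        a = max(range(len(ACTIONS)), key=lambda i: q[(state, i)])
        state, reward = step(state, ACTIONS[a])
        total += reward
    return total
```

Calling `greedy_rollout(train(), (0, 0))` gives the energy collected by the learned trajectory in this toy model. The same `step` function can be driven by a uniformly random action or by the hover action alone to mimic the two benchmark strategies named in the abstract.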