An Online Convex Optimization Approach to Dynamic Network Resource Allocation
Existing approaches to online convex optimization (OCO) make sequential one-slot-ahead decisions, which lead to (possibly adversarial) losses that drive subsequent decision iterates. Their performance is evaluated by the so-called regret that measures the difference of losses between the online solution and the best yet fixed overall solution in hindsight. The present paper deals with online convex optimization involving adversarial loss functions and adversarial constraints, where the constraints are revealed after making decisions, and can be tolerable to instantaneous violations but must be satisfied in the long term. Performance of an online algorithm in this setting is assessed by: i) the difference of its losses relative to the best dynamic solution with one-slot-ahead information of the loss function and the constraint (that is here termed dynamic regret); and, ii) the accumulated amount of constraint violations (that is here termed dynamic fit). In this context, a modified online saddle-point (MOSP) scheme is developed, and proved to simultaneously yield sub-linear dynamic regret and fit, provided that the accumulated variations of per-slot minimizers and constraints are sub-linearly growing with time. MOSP is also applied to the dynamic network resource allocation task, and it is compared with the well-known stochastic dual gradient method. Under various scenarios, numerical experiments demonstrate the performance gain of MOSP relative to the state-of-the-art.
READ FULL TEXT