Learning to Price Vehicle Service with Unknown Demand
It can be profitable for vehicle service providers to set service prices based on users' travel demand on different origin-destination pairs. Prior studies on the spatial pricing of vehicle service assume that providers know users' demand. In this paper, we study a monopolistic provider who initially does not know users' demand and needs to learn it over time by observing users' responses to the service prices. We design a pricing and vehicle supply policy that accounts for the tradeoff between exploration (i.e., learning the demand) and exploitation (i.e., maximizing the provider's short-term payoff). Because the provider needs to ensure the vehicle flow balance at each location, its pricing and supply decisions for different origin-destination pairs are tightly coupled, which makes it challenging to analyze the performance of our policy theoretically. We analyze the gap between the provider's expected time-average payoff under our policy and that under a clairvoyant policy, which makes decisions based on complete information about the demand. We prove that after running our policy for D days, the loss in the expected time-average payoff is at most O((ln D)^0.5 D^(-0.25)), which decays to zero as D approaches infinity.
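To give a feel for how slowly the stated bound decays, the following minimal sketch evaluates the rate (ln D)^0.5 D^(-0.25) for a few horizons. The function name `bound_rate` and the sample values of D are illustrative assumptions, and the constant hidden by the O(·) notation is omitted, so the numbers show only the shape of the decay, not the provider's actual payoff loss.

```python
import math


def bound_rate(days: int) -> float:
    """Return (ln D)^0.5 * D^(-0.25), the rate in the stated bound.

    Hypothetical helper for illustration only: the constant factor hidden
    by the O(.) notation is not given in the abstract and is left out.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return math.sqrt(math.log(days)) * days ** -0.25


if __name__ == "__main__":
    # Sample horizons chosen arbitrarily to show the trend.
    for days in (10, 100, 10_000, 1_000_000):
        print(f"D = {days:>9,d}  rate = {bound_rate(days):.4f}")
```

The logarithmic factor grows more slowly than any positive power of D, so the D^(-0.25) term dominates and the rate tends to zero, although only at a polynomially slow pace.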