Streaming Algorithms for Online Selection Problems
The model of streaming algorithms is motivated by the increasingly common situation in which the sheer amount of available data limits the ways in which the data can be accessed. Streaming algorithms are typically allowed a single pass over the data and can only store a sublinear fraction of the data at any time. We initiate the study of classic online selection problems in a streaming model where the data stream consists of two parts: historical data points that an algorithm can use to learn something about the input; and data points from which a selection can be made. Both types of data points are i.i.d. draws from an unknown distribution. We consider the two canonical objectives for online selection—maximizing the probability of selecting the maximum and maximizing the expected value of the selection—and provide the first performance guarantees for both these objectives in the streaming model.
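To make the model concrete, the following is a minimal sketch of a single-pass selection rule that keeps only one number in memory. It is not the algorithm analyzed in the paper; it is the simple "beat the best historical value" rule, chosen purely for illustration. The names `select_from_stream`, `estimate_max_probability`, `num_history`, and `num_selectable`, and the use of uniform draws as the unknown distribution, are all assumptions made for this example.

```python
import random
from itertools import chain
from typing import Iterable, Optional


def select_from_stream(stream: Iterable[float], num_history: int) -> Optional[float]:
    """Single pass over the stream, storing one float (the threshold).

    Illustrative rule (an assumption, not the paper's method): remember the
    maximum of the first `num_history` points, which play the role of the
    historical data, then select the first later point that exceeds it.
    Returns None if no later point exceeds the threshold.
    """
    threshold = float("-inf")
    for index, value in enumerate(stream):
        if index < num_history:
            # Historical phase: learn a threshold, nothing can be selected.
            threshold = max(threshold, value)
        elif value > threshold:
            # Selection phase: the choice is irrevocable, so stop here.
            return value
    return None


def estimate_max_probability(
    num_history: int, num_selectable: int, trials: int, seed: int = 0
) -> float:
    """Monte Carlo estimate of how often the rule picks the maximum
    of the selectable points, using i.i.d. uniform draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        history = [rng.random() for _ in range(num_history)]
        # The list below is kept only to check the outcome afterwards;
        # the selection rule itself never stores the stream.
        selectable = [rng.random() for _ in range(num_selectable)]
        choice = select_from_stream(chain(history, selectable), num_history)
        if choice is not None and choice == max(selectable):
            hits += 1
    return hits / trials
```

Calling `estimate_max_probability(num_history=50, num_selectable=50, trials=10_000)` gives an empirical success rate for the first objective, selecting the maximum, under these assumptions. Averaging the selected values instead would give the corresponding estimate for the second objective, the expected value of the selection.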