random order
-
Streaming Submodular Maximization under Matroid ConstraintsMoran Feldman, Paul Liu, Ashkan Norouzi-Fard, Ola Svensson, Jan Vondrák, and Rico Zenklusen
[pdf]Recent progress in (semi-)streaming algorithms for monotone submodular function maximization has led to tight results for a simple cardinality constraint. However, current techniques fail to give a similar understanding for natural generalizations, including matroid constraints. Our paper improves on the previously best approximation guarantees of $1/4$ and $1/2$ for single-pass and multi-pass streaming algorithms, respectively. In fact, our multi-pass streaming algorithm is tight in that any algorithm with a better guarantee than $1/2$ must make several passes through the stream and any algorithm that beats our guarantee of $1-1/e$ must make linearly many passes (as well as an exponential number of value oracle queries).
-
Cardinality constrained submodular maximization for random streamsPaul Liu, Aviad Rubinstein, Jan Vondrák, Junyao Zhao
[pdf] [code] [poster] [slides]
We consider the problem of maximizing submodular functions in single-pass streaming and secretaries-with-shortlists models, both with random arrival order. For cardinality constrained monotone functions, Agrawal, Shadravan, and Stein gave a single-pass $(1-1/e-\varepsilon)$-approximation algorithm using only linear memory, but their exponential dependence on $\varepsilon$ makes it impractical even for $\varepsilon=0.1$. We simplify both the algorithm and the analysis, obtaining an exponential improvement in the $\varepsilon$-dependence (in particular, $O(k/\varepsilon)$ memory). Extending these techniques, we also give a simple $(1/e-\varepsilon)$-approximation for non-monotone functions in $O(k/\varepsilon)$ memory. For the monotone case, we also give a corresponding unconditional hardness barrier of $1-1/e+\varepsilon$ for single-pass algorithms in randomly ordered streams, even assuming unlimited computation.
Finally, we show that the algorithms are simple to implement and work well on real world datasets.