Tight Bounds for Single-Pass Streaming Complexity of the Set Cover Problem

Authors: Sepehr Assadi, Sanjeev Khanna, Yang Li.
Conference: The 48th Annual Symposium on the Theory of Computing (STOC'16).
This paper resolves an open question raised by Indyk et al in [IMV'15].
Abstract: We resolve the space complexity of single-pass streaming algorithms for approximating the classic set cover problem. For finding an α-approximate set cover (for any α = o(sqrt{n})) using a single-pass streaming algorithm, we show that Θ(mn/α) space is both sufficient and necessary (up to an O(log n) factor); here m denotes the number of the sets and n denotes the size of the universe. This provides a strong negative answer to the open question posed by Indyk et al. regarding the possibility of having a single-pass algorithm with a small approximation factor that uses sub-linear space.

We further study the problem of estimating the size of a minimum set cover (instead of finding the actual sets), and establish that an additional factor of α saving in the space is achievable in this case and that this is the best possible. In other words, Θ(mn/α^2) space is both sufficient and necessary (up to logarithmic factors) for estimating the size of a minimum set cover to within a factor of α. Our algorithm in fact works for the more general problem of estimating the optimal value of a covering integer program. On the other hand, our lower bound holds even for set cover instances where the sets are presented in a random order.
Conference version: [PDF]
Full version: [arXiv]
BibTex: [DBLP]