Abstract:
We resolve the space complexity of single-pass streaming algorithms for approximating the classic set cover problem.
For finding an α-approximate set cover (for any α = o(sqrt{n})) using a single-pass streaming algorithm, we show
that Θ(mn/α) space is both sufficient and necessary (up to an O(log n) factor); here m denotes the number of the sets and n denotes the size of the universe.
This provides a strong negative answer to the open question posed by Indyk et al. regarding the possibility of having a single-pass algorithm with a small approximation factor that
uses sub-linear space.

We further study the problem of estimating the size of a minimum set cover (instead of finding the actual sets), and establish that an additional factor of α saving in the space
is achievable in this case and that this is the best possible. In other words, Θ(mn/α^2) space is both sufficient and necessary (up to logarithmic factors) for estimating
the size of a minimum set cover to within a factor of α. Our algorithm in fact works for the more general problem of estimating the optimal value of a covering integer program.
On the other hand, our lower bound holds even for set cover instances where the sets are presented in a random order.

Conference version:
[PDF]