Title: | Greedy Set Cover |
---|---|
Description: | A fast implementation of the greedy algorithm for the set cover problem using 'Rcpp'. |
Authors: | Matthias Kaeding [aut, cre] |
Maintainer: | Matthias Kaeding <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2025-03-10 05:19:49 UTC |
Source: | https://github.com/matthiaskaeding/rcppgreedysetcover |
This package offers an implementation of the greedy algorithm for the set cover problem using Rcpp.
We are given a universe of elements U and a collection F of subsets of U that covers U, i.e. the union of the sets in F equals U. The objective is to find A, the smallest subcollection of F that still covers U. An important application is hospital placement, where the number of hospitals is minimized under the constraint that all residents are provided for.
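To make the definition concrete, here is a small toy instance in base R. The names `U`, `sets_F`, `A` and the helper `covers()` are illustrative assumptions and not part of the package.

```r
# Toy instance: universe U and a collection of subsets (hypothetical data).
U <- 1:5
sets_F <- list(
  s1 = c(1L, 2L, 3L),
  s2 = c(3L, 4L),
  s3 = c(4L, 5L),
  s4 = c(1L, 5L)
)

# A collection covers U if the union of its sets equals U.
covers <- function(collection, universe) {
  setequal(unlist(collection, use.names = FALSE), universe)
}

covers(sets_F, U)        # the full collection covers U
A <- sets_F[c("s1", "s3")]
covers(A, U)             # this subcollection of size two also covers U
covers(sets_F["s1"], U)  # a single set does not
```

No single set contains all five elements, so a cover of size two is the best possible for this instance.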
An optimal solution can be obtained via integer linear programming, but this is not feasible for large problems because of the computational demands involved. The greedy algorithm gives a quick approximate solution. Starting from an empty A, it iterates the following steps until all elements are covered:
1. Add to A the set in F that contains the most uncovered elements.
2. Remove the newly covered elements from all sets in F.
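The two steps above can be sketched in plain R. This is only an illustration of the idea under simple assumptions (sets given as a named list, ties broken by taking the first largest set); `greedy_cover_sketch()` is a hypothetical helper and not the package's Rcpp implementation.

```r
# Plain-R sketch of the greedy heuristic (hypothetical helper, not the
# package's implementation). `sets` is assumed to be a named list of vectors.
greedy_cover_sketch <- function(sets) {
  remaining <- lapply(sets, unique)   # working copy of F
  uncovered <- unique(unlist(remaining, use.names = FALSE))
  chosen <- character(0)              # names of the sets placed in A

  while (length(uncovered) > 0L) {
    # Step 1: pick the set with the most uncovered elements
    # (which.max takes the first one in case of ties).
    best <- names(remaining)[which.max(lengths(remaining))]
    chosen <- c(chosen, best)
    newly_covered <- remaining[[best]]
    uncovered <- setdiff(uncovered, newly_covered)

    # Step 2: remove the newly covered elements from every set in F.
    remaining <- lapply(remaining, setdiff, y = newly_covered)
  }
  chosen
}

toy_sets <- list(
  s1 = c(1, 2, 3),
  s2 = c(3, 4),
  s3 = c(4, 5),
  s4 = c(1, 5)
)
greedy_cover_sketch(toy_sets)
```

On this toy input the sketch first selects `s1` (three uncovered elements) and then `s3` (the only set that still holds two uncovered elements).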
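Despite its simplicity, the greedy algorithm comes with a provable guarantee: the cover it returns is at most H(n) = 1 + 1/2 + ... + 1/n <= ln(n) + 1 times the size of an optimal cover, where n is the number of elements in U. For a nice introduction to the set cover problem and the greedy algorithm, see Vazirani (2001).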
Vazirani, Vijay V. (2001), Approximation Algorithms, Springer
Fast greedy set cover algorithm.
greedySetCover(X, data.table = TRUE)
X | Two-column data.frame in long format: Column 1 identifies the sets, column 2 the elements. |
data.table | If TRUE, the result is returned as a data.table; otherwise as a data.frame. |
If data.table == TRUE, a data.table, keyed by sets and elements. Else a data.frame, sorted by sets and elements. Column names are derived from input.
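If the sets are held as a list rather than in long format, a two-column data.frame of the documented shape can be built with base R. The following is a minimal sketch; the column names `set` and `element` and the toy list `sets` are assumptions chosen for illustration.

```r
# Hypothetical input: a list of sets, each a vector of integer element ids.
sets <- list(
  c(1L, 2L, 3L),
  c(3L, 4L),
  c(4L, 5L)
)

# Long format: one row per (set, element) pair.
# Column 1 identifies the set, column 2 the element.
X <- data.frame(
  set     = rep(seq_along(sets), lengths(sets)),
  element = unlist(sets, use.names = FALSE)
)
X
```

A data.frame of this shape follows the input layout described for `X` above (sets in the first column, elements in the second).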
# Create some data.
set.seed(333)
X <- data.table::rbindlist(
  lapply(
    seq_len(1e4L),
    function(x) list(element = sample.int(n = 1e3L, size = sample.int(50L, 1L)))
  ),
  idcol = "set"
)
# Elements are integers 1,2,...,1000.
# Run set cover
res <- greedySetCover(X, FALSE)
head(res)
# Check if all elements are covered.
identical(sort(unique(res$element)), sort(unique(X$element)))