Package 'RcppGreedySetCover'

Title: Greedy Set Cover
Description: A fast implementation of the greedy algorithm for the set cover problem using 'Rcpp'.
Authors: Matthias Kaeding [aut, cre]
Maintainer: Matthias Kaeding <[email protected]>
License: MIT + file LICENSE
Version: 0.1.0
Built: 2025-03-10 05:19:49 UTC
Source: https://github.com/matthiaskaeding/rcppgreedysetcover


Fast Greedy Set Cover

Description

This package offers an implementation of the greedy algorithm for the set cover problem using Rcpp.

Set Cover Problem

We are given a universe of elements U and a collection F of subsets of U that covers U; that is, the union of the sets in F equals U. The objective is to find A, a smallest subcollection of F that still covers U. An important application is hospital placement, where the number of hospitals is minimized subject to the constraint that every resident is served.
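To make the definition concrete, here is a small sketch in base R. The universe, the collection, and the helper `covers` are hypothetical illustrations and not part of the package; the brute-force search is only practical for tiny instances.

```r
# Toy instance (hypothetical data): universe U and a collection of subsets.
U <- 1:5
collection <- list(s1 = c(1, 2, 3), s2 = c(2, 4), s3 = c(3, 4), s4 = c(4, 5))

# A subcollection covers U if the union of its sets equals U.
covers <- function(sets) setequal(unlist(sets), U)
stopifnot(covers(collection))

# Brute force: try subcollections of increasing size until one covers U.
smallest_cover <- NULL
for (k in seq_along(collection)) {
  candidates <- combn(names(collection), k, simplify = FALSE)
  hits <- Filter(function(nms) covers(collection[nms]), candidates)
  if (length(hits) > 0L) {
    smallest_cover <- hits[[1L]]
    break
  }
}
smallest_cover
```

The number of candidate subcollections grows exponentially with the size of the collection, which is why exact search does not scale.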

An optimal solution can be computed via integer linear programming; however, this is not feasible for large problems because of the computational demands involved. The greedy algorithm gives a quick approximate solution. Starting from an empty A, it iterates the following steps until all elements are covered:

  • Add to A the set in F that contains the most uncovered elements.

  • Remove the elements covered by that set from all remaining sets in F.

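This simple algorithm has a good approximation guarantee: the cover it returns contains at most H(n) <= 1 + ln(n) times as many sets as an optimal cover, where n is the number of elements in U and H(n) is the n-th harmonic number. For an introduction to the set cover problem and the greedy algorithm, see Vazirani (2001).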
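The two greedy steps can be sketched in plain R as follows. This is a minimal illustration only: `greedy_cover` is a hypothetical helper written for this example and is not the package's Rcpp implementation. It assumes a named list of sets and tracks the uncovered elements instead of modifying the sets themselves.

```r
# Plain-R sketch of the greedy steps; `greedy_cover` is a hypothetical helper,
# not the package's Rcpp implementation.
greedy_cover <- function(sets) {
  # The universe is taken to be the union of all sets.
  uncovered <- unique(unlist(sets, use.names = FALSE))
  chosen <- character(0)
  while (length(uncovered) > 0L) {
    # Count how many still-uncovered elements each set contains.
    gain <- vapply(sets, function(s) sum(s %in% uncovered), integer(1))
    # Pick the set with the largest gain (ties go to the first such set).
    best <- names(sets)[which.max(gain)]
    chosen <- c(chosen, best)
    # Mark the elements of the chosen set as covered.
    uncovered <- setdiff(uncovered, sets[[best]])
  }
  chosen
}

sets <- list(a = 1:4, b = c(1, 5), c = c(2, 6), d = c(5, 6, 7))
cover <- greedy_cover(sets)
cover
```

Recomputing every gain in each iteration keeps the sketch short but is slow for large inputs; a fast implementation would avoid rescanning all sets.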

References

Vazirani, Vijay V. (2001), Approximation Algorithms, Springer


Greedy Set Cover

Description

Fast greedy set cover algorithm.

Usage

greedySetCover(X, data.table = TRUE)

Arguments

X

Two-column data.frame in long format: Column 1 identifies the sets, column 2 the elements.

data.table

If TRUE, returns a data.table keyed by sets and elements. If FALSE, returns a data.frame sorted by sets and elements.
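As an illustration of the long format expected for X, the sketch below builds a two-column data.frame from a named list in base R. The list, the column names, and the use of character set identifiers are assumptions made for this example; the package's own example uses integer identifiers.

```r
# Hypothetical named list: names are set identifiers, values are elements.
sets <- list(a = c(1L, 2L, 3L), b = c(2L, 4L), c = c(4L, 5L))

# One row per (set, element) pair; column 1 holds sets, column 2 elements.
X <- data.frame(
  set = rep(names(sets), times = lengths(sets)),
  element = unlist(sets, use.names = FALSE),
  stringsAsFactors = FALSE
)
X
```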

Value

If data.table == TRUE, a data.table keyed by sets and elements; otherwise, a data.frame sorted by sets and elements. Column names are derived from the input.

Examples

# Load the package and create some data.
library(RcppGreedySetCover)
set.seed(333)
X <- data.table::rbindlist(
  lapply(
    seq_len(1e4L),
    function(x) list(element=sample.int(n=1e3L,size=sample.int(50L,1L)))
  ),
  idcol="set"
)
# Elements are integers 1,2,...,1000.

# Run set cover
res <- greedySetCover(X, data.table = FALSE)
head(res)

# Check if all elements are covered.
identical(sort(unique(res$element)), sort(unique(X$element)))