MaxMin implements algorithms that select a dispersed subsample of points that maximizes coverage of a larger set, under one of two objectives:
The Max-Min Diversity Problem (MMDP, the discrete p-dispersion objective) selects \(k\) elements such that the minimum distance between any pair of selected elements is as large as possible; the chosen elements are maximally separated.
The discrete k-centre problem selects \(k\) elements such that the maximum distance from any element in the original set to a selected element is as small as possible.
Solvers
The greedy farthest-first heuristic FarFirst() serves as a quick approximate solver to both problems, providing a solution that is guaranteed to be within a factor of two of the optimum (and typically much closer).
Better approximations can be accomplished, at the cost of longer runs, by specialized solvers.
MMDP (max-min dispersion)
| Function | Method | Use |
|---|---|---|
DropAdd() |
DropAdd tabu search (Porumbel et al. 2011) | ~99%-optimal heuristic |
Grasp() |
GRASP with path-relinking metaheuristic (Resende et al. 2010) | Slower but powerful heuristic |
ExactMaxMin() |
Node-packing integer program (Sayyady & Fathi 2016) | Proven optimum, small k (needs highs) |
k-centre (min-max covering)
| Function | Method | Use |
|---|---|---|
KCentre() |
CDSh heuristic (García-Díaz et al. 2017, 2019) | ~1–3.5% of optimum at \(O(N^2 \log N)\), typically far tighter than FarFirst()
|
ExactKCentre() |
Min-cover integer program | Proven optimum, small k (needs highs) |
Solvers support precomputed distance matrices (dist objects), matrices of Euclidian coordinates, or lists of elements from which distances can be calculated.
MinDist() returns the minimum pairwise distance within a selection (the MMDP objective) and KCentreRadius() returns its covering radius (the k-centre objective).
Related problems
MaxMin selects a subset from a given set of elements. Several established packages solve neighbouring objectives:
-
k-medoids / k-median selects elements that minimize the mean distance from each element to its nearest centre. Implementations include:
-
cluster::pam(): generates the full O(N²) dissimilarity matrix, and hence caps at n ≤ 65 536; -
banditpam, a matrix-free \(O(N \log N)\) implementation restricted to coordinate data; -
cluster::clara()(PAM / FastPAM / FasterPAM); -
ClusterR::Cluster_Medoids().
-
k-means (
stats::kmeans()) selects elements so as to minimize the within-cluster sum of squares around centres that are coordinate means, not data points; as such, it applies only to Euclidean coordinates. k-means++ (TreeDist::KMeansPP()) initializes its selection using D²-weighted seeding, a randomized relative ofFarFirst()’s farthest-first traversal.maximinsolves the related design problem of adding new points at positions that maximize the minimum inter-point distance.