Maximum-Minimum Diversity and Discrete k-Centre Subset Selection • MaxMin

MaxMin implements algorithms that select a dispersed subsample of points that maximizes coverage of a larger set, under one of two objectives:

The Max-Min Diversity Problem (MMDP, the discrete p-dispersion objective) selects \(k\) elements such that the minimum distance between any pair of selected elements is as large as possible; the chosen elements are maximally separated.

The discrete k-centre problem selects \(k\) elements such that the maximum distance from any element in the original set to a selected element is as small as possible.

Solvers

The greedy farthest-first heuristic FarFirst() serves as a quick approximate solver to both problems, providing a solution that is guaranteed to be within a factor of two of the optimum (and typically much closer).

Better approximations can be accomplished, at the cost of longer runs, by specialized solvers.

MMDP (max-min dispersion)

Function	Method	Use
`DropAdd()`	DropAdd tabu search (Porumbel et al. 2011)	~99%-optimal heuristic
`Grasp()`	GRASP with path-relinking metaheuristic (Resende et al. 2010)	Slower but powerful heuristic
`ExactMaxMin()`	Node-packing integer program (Sayyady & Fathi 2016)	Proven optimum, small `k` (needs `highs`)

k-centre (min-max covering)

Function	Method	Use
`KCentre()`	CDSh heuristic (García-Díaz et al. 2017, 2019)	~1–3.5% of optimum at \(O(N^2 \log N)\), typically far tighter than `FarFirst()`
`ExactKCentre()`	Min-cover integer program	Proven optimum, small `k` (needs `highs`)

Solvers support precomputed distance matrices (dist objects), matrices of Euclidian coordinates, or lists of elements from which distances can be calculated.

MinDist() returns the minimum pairwise distance within a selection (the MMDP objective) and KCentreRadius() returns its covering radius (the k-centre objective).

Installation

# install.packages("remotes")
remotes::install_github("ms609/MaxMin")

MaxMin selects a subset from a given set of elements. Several established packages solve neighbouring objectives:

k-medoids / k-median selects elements that minimize the mean distance from each element to its nearest centre. Implementations include:
- cluster::pam(): generates the full O(N²) dissimilarity matrix, and hence caps at n ≤ 65 536;
- banditpam, a matrix-free \(O(N \log N)\) implementation restricted to coordinate data;
- cluster::clara() (PAM / FastPAM / FasterPAM);
- ClusterR::Cluster_Medoids().
k-means (stats::kmeans()) selects elements so as to minimize the within-cluster sum of squares around centres that are coordinate means, not data points; as such, it applies only to Euclidean coordinates. k-means++ (TreeDist::KMeansPP()) initializes its selection using D²-weighted seeding, a randomized relative of FarFirst()’s farthest-first traversal.
maximin solves the related design problem of adding new points at positions that maximize the minimum inter-point distance.

MaxMin

Solvers

MMDP (max-min dispersion)

k-centre (min-max covering)

Installation

Related problems