Calculate rooted Subtree Prune-and-Regraft (rSPR) distances between pairs of rooted binary trees using the algorithms of Whidden, Beiko & Zeh (2013).

RSPRDist(
  tree1,
  tree2 = NULL,
  allPairs = is.null(tree2),
  checks = TRUE,
  approx = FALSE,
  maf = FALSE
)

Arguments

tree1, tree2

Trees of class phylo, or lists thereof. All trees must be rooted and bear identical sets of tip labels.

allPairs

Logical; if TRUE, compare each tree in tree1 with each tree in tree2; if FALSE, compare corresponding pairs. Defaults to TRUE when tree2 is not supplied.

checks

Logical; validate tree labels and dimensions before computation. Set FALSE at your peril—improper input is likely to crash R.

approx

Logical; if TRUE return the linear-time 3-approximation instead of the exact distance.

maf

Logical; if TRUE return the maximum agreement forest alongside the exact distance (implies approx = FALSE).

Value

By default, an integer vector of rSPR distances (one per tree pair), or a dist object when allPairs = TRUE and tree2 = NULL.

If maf = TRUE, a named list with elements:

exact

Integer vector of rSPR distances.

maf_1, maf_2

Character vectors giving the maximum agreement forest for each pair of trees, expressed as space-separated Newick components.

Details

This function wraps the rspr C++ library by Chris Whidden. It handles rooted trees and is substantially more efficient than USPRDist() for large trees: it can handle pairs with 200+ leaves and distances > 50.

Note that these distances are NP-hard to compute exactly; running time scales as O(2^k n) where k is the rSPR distance and n is the number of leaves. The built-in cluster decomposition (enabled by default) provides a large practical speedup. Use approx = TRUE for a guaranteed linear-time 3-approximation.

Input trees must be rooted. An error is raised if any tree is unrooted.

References

Whidden C, Beiko RG, Zeh N (2013). Fixed-Parameter Algorithms for Maximum Agreement Forests. SIAM Journal on Computing 42:1431–1466. doi:10.1137/110845045

Whidden C, Zeh N, Beiko RG (2014). Supertrees based on the subtree prune-and-regraft distance. Systematic Biology 63:566–581. doi:10.1093/sysbio/syu023

See also

USPRDist() for unrooted trees.

Examples

library(ape)
set.seed(1)
tree1 <- rtree(8)
tree2 <- rtree(8)
tree1$tip.label <- tree2$tip.label <- paste0("t", 1:8)

RSPRDist(tree1, tree2)
#> [1] 2

# All pairwise distances among a list of trees
trees <- c(tree1, tree2)
RSPRDist(trees)
#>   1
#> 2 2

# Fast 3-approximation
RSPRDist(tree1, tree2, approx = TRUE)
#> [1] 2

# With maximum agreement forest
RSPRDist(tree1, tree2, maf = TRUE)
#> $exact
#> [1] 2
#> 
#> $maf_1
#> [1] "(((t3,t4),t5),((t6,t7),t8)) t2 t1 "
#> 
#> $maf_2
#> [1] "(((t3,t4),t5),((t6,t7),t8)) t2 t1 "
#>