Calculate rooted Subtree Prune-and-Regraft (rSPR) distances between pairs of rooted binary trees using the algorithms of Whidden, Beiko & Zeh (2013).
RSPRDist(
tree1,
tree2 = NULL,
allPairs = is.null(tree2),
checks = TRUE,
approx = FALSE,
maf = FALSE
)Trees of class phylo, or lists thereof.
All trees must be rooted and bear identical sets of tip labels.
Logical; if TRUE, compare each tree in tree1
with each tree in tree2; if FALSE, compare corresponding
pairs. Defaults to TRUE when tree2 is not supplied.
Logical; validate tree labels and dimensions before
computation. Set FALSE at your peril—improper input is likely
to crash R.
Logical; if TRUE return the linear-time
3-approximation instead of the exact distance.
Logical; if TRUE return the maximum agreement forest
alongside the exact distance (implies approx = FALSE).
By default, an integer vector of rSPR distances (one per tree pair), or a
dist object when allPairs = TRUE and tree2 = NULL.
If maf = TRUE, a named list with elements:
exactInteger vector of rSPR distances.
maf_1, maf_2Character vectors giving the maximum agreement forest for each pair of trees, expressed as space-separated Newick components.
This function wraps the rspr C++ library by Chris Whidden. It handles
rooted trees and is substantially more efficient than
USPRDist() for large trees: it can handle pairs with 200+
leaves and distances > 50.
Note that these distances are NP-hard to compute exactly; running time
scales as O(2^k n) where k is the rSPR distance and n is the number of
leaves. The built-in cluster decomposition (enabled by default) provides a
large practical speedup. Use approx = TRUE for a guaranteed
linear-time 3-approximation.
Input trees must be rooted. An error is raised if any tree is unrooted.
Whidden C, Beiko RG, Zeh N (2013). Fixed-Parameter Algorithms for Maximum Agreement Forests. SIAM Journal on Computing 42:1431–1466. doi:10.1137/110845045
Whidden C, Zeh N, Beiko RG (2014). Supertrees based on the subtree prune-and-regraft distance. Systematic Biology 63:566–581. doi:10.1093/sysbio/syu023
USPRDist() for unrooted trees.
library(ape)
set.seed(1)
tree1 <- rtree(8)
tree2 <- rtree(8)
tree1$tip.label <- tree2$tip.label <- paste0("t", 1:8)
RSPRDist(tree1, tree2)
#> [1] 2
# All pairwise distances among a list of trees
trees <- c(tree1, tree2)
RSPRDist(trees)
#> 1
#> 2 2
# Fast 3-approximation
RSPRDist(tree1, tree2, approx = TRUE)
#> [1] 2
# With maximum agreement forest
RSPRDist(tree1, tree2, maf = TRUE)
#> $exact
#> [1] 2
#>
#> $maf_1
#> [1] "(((t3,t4),t5),((t6,t7),t8)) t2 t1 "
#>
#> $maf_2
#> [1] "(((t3,t4),t5),((t6,t7),t8)) t2 t1 "
#>