Calculate the single binary tree that represents the geometric median -- an 'average' -- of a forest of tree topologies.

# S3 method for multiPhylo
median(
  x,
  na.rm = FALSE,
  Distance = ClusteringInfoDistance,
  index = FALSE,
  breakTies = TRUE,
  ...
)

Arguments

x

Object of class multiPhylo containing phylogenetic trees.

na.rm, ...

Unused; included for consistency with default function..

Distance

Function to calculate distances between each pair of trees in x.

index

Logical: if TRUE, return the index of the median tree(s); if FALSE, return the tree itself.

breakTies

Logical: if TRUE, return a single tree with the minimum score; if FALSE, return all tied trees.

Value

median() returns an object of class phylo corresponding to the geometric median of a set of trees: that is, the tree whose average distance from all other trees in the set is lowest. If multiple trees tie in their average distance, the first will be returned, unless breakTies = FALSE, in which case an object of class multiPhylo containing all such trees will be returned.

Details

The geometric median is the tree that exhibits the shortest average distance from each other tree topology in the set. It represents an 'average' of a set of trees, though note that an unsampled tree may be closer to the geometric 'centre of gravity' of the input set -- such a tree would not be considered.

The result will depend on the metric chosen to calculate distances between tree topologies. In the absence of a natural metric of tree topologies, the default choice is ClusteringInfoDistance() -- which discards branch length information. If specifying a different function, be sure that it returns a difference, rather than a similarity.

See also

Consensus methods: ape::consensus(), TreeTools::ConsensusWithout()

Other tree space functions: MapTrees(), MappingQuality(), SpectralEigens()

Author

Martin R. Smith (martin.smith@durham.ac.uk)

Examples

library('TreeTools', quietly = TRUE, warn.conflicts = FALSE) tenTrees <- as.phylo(1:10, nTip = 8) # Default settings: median(tenTrees)
#> #> Phylogenetic tree with 8 tips and 7 internal nodes. #> #> Tip labels: #> t1, t2, t3, t4, t5, t6, ... #> #> Rooted; no branch lengths.
# Robinson-Foulds distances include ties: median(tenTrees, Distance = RobinsonFoulds, breakTies = FALSE)
#> 4 phylogenetic trees
# Be sure to use a distance function, rather than a similarity: NyeDistance <- function (...) NyeSimilarity(..., similarity = FALSE) median(tenTrees, Distance = NyeDistance)
#> #> Phylogenetic tree with 8 tips and 7 internal nodes. #> #> Tip labels: #> t1, t2, t3, t4, t5, t6, ... #> #> Rooted; no branch lengths.
# To analyse a list of trees that is not of class multiPhylo: treeList <- lapply(1:10, as.phylo, nTip = 8) class(treeList)
#> [1] "list"
median(structure(treeList, class = 'multiPhylo'))
#> #> Phylogenetic tree with 8 tips and 7 internal nodes. #> #> Tip labels: #> t1, t2, t3, t4, t5, t6, ... #> #> Rooted; no branch lengths.