“Rogue” implements approaches to identify rogue taxa in phylogenetic analysis. Rogues are wildcard leaves whose uncertain position reduces the resolution of consensus trees. Consensus trees that omit rogue taxa can be more informative.
“Rogue” allows the user to select a concept of “information” by which the quality of consensus trees should be evaluated, and a heuristic approach by which rogue taxa should be identified.
Rogue detection using the phylogenetic and clustering information content measures (Smith, 2022) is implemented in two ways: a quick heuristic that drops the least “stable” leaves one at a time, using an ad hoc definition of stability (Smith, 2022); and a more exhaustive (and time-consuming) approach that considers dropping all possible sets of up to n leaves (Aberer et al., 2013).
The latter heuristic is implemented for the relative bipartition “information” content and Pattengale’s criterion via RogueNaRok (Aberer et al., 2013).
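The two approaches described above might be called as in the following minimal sketch. It assumes that the “TreeTools” package is installed alongside “Rogue”; the example trees and the leaf name "Rogue" are purely illustrative, and the argument values (`info = "spic"`, `info = "rbic"`, `dropsetSize = 2`) should be checked against `?RogueTaxa` and `?QuickRogue` for your installed version.

```r
# Assumes the "Rogue" and "TreeTools" packages are installed.
library("TreeTools", warn.conflicts = FALSE)
library("Rogue")

# Illustrative input: a balanced eight-leaf tree with one extra leaf,
# arbitrarily named "Rogue", attached at each possible position in turn.
trees <- AddTipEverywhere(BalancedTree(8), "Rogue")

# Quick heuristic: drops the least stable leaves one at a time,
# using the function's default information measure.
quick <- QuickRogue(trees)

# Slower, more exhaustive search: considers dropping sets of up to two
# leaves, scored by splitwise phylogenetic information content.
thorough <- RogueTaxa(trees, info = "spic", dropsetSize = 2)

# RogueNaRok-based search using relative bipartition information content.
rnr <- RogueTaxa(trees, info = "rbic")

# Inspect the leaves proposed for removal by the quick heuristic.
quick
```

The quick heuristic is a reasonable first pass on large tree sets; the exhaustive search can then be run where the extra time is affordable.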
Install and load the stable version from CRAN as normal:
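A minimal sketch of the usual commands, assuming the package is published on CRAN under the name "Rogue":

```r
# Download and install the released version from CRAN
install.packages("Rogue")

# Attach the package for the current session
library("Rogue")
```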
Install the development version from GitHub with
devtools::install_github("ms609/Rogue", args = "--recursive")
This requires git to be installed and added to your PATH system environment variable; you may also require the “curl” R package.
If you find this package useful in your work, please consider citing Smith (2021).
To cite the underlying methods, please cite Aberer et al. (2013) (‘RogueNaRok’) or Smith (2022), as appropriate.
A.J. Aberer, D. Krompass, A. Stamatakis (2013): Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice. Systematic Biology 62(1): 162–166, doi:10.1093/sysbio/sys078.
M.R. Smith (2021): Rogue: Identify rogue taxa in sets of phylogenetic trees. Zenodo, doi:10.5281/zenodo.5037327.
M.R. Smith (2022): Using information theory to detect rogue taxa and improve consensus trees. Systematic Biology, syab099, doi:10.1093/sysbio/syab099.