Implementation and results of a 'Bullseye' test, after that proposed by Kuhner and Yamato (2015).
bullseyeTrees bullMoDiInferred bullMoDiScores bullseyeMorphInferred bullseyeMorphScores
bullseyeTrees is a list with four elements, named 5 leaves, 10 leaves,
20 leaves and 50 leaves.
Each element contains 1 000 trees with n leaves, randomly sampled
(note: not from the uniform distribution) using ape::rtree().
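A list with this layout could be built along the following lines. This is a minimal sketch rather than the script used to generate the data: the seed, the reduced number of trees per size, and the choice to omit branch lengths (br = NULL) are assumptions made for illustration, and sketchTrees is a hypothetical name.

```r
library(ape)

set.seed(2020)                 # arbitrary seed, for reproducibility only
leafCounts <- c(5, 10, 20, 50)
treesPerSize <- 10             # the documented object holds 1 000 per size

# One list element per tree size, each holding `treesPerSize` random trees.
# rtree() builds trees by random splitting, so topologies are not drawn
# from the uniform distribution.
sketchTrees <- lapply(leafCounts, function(n) {
  lapply(seq_len(treesPerSize), function(i) rtree(n, br = NULL))
})
names(sketchTrees) <- paste(leafCounts, "leaves")

# Elements are then accessed by name:
firstTwentyLeafTree <- sketchTrees[["20 leaves"]][[1]]
```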
The bullseyeMorph prefix refers to the 'subsampling' experiment
described by Smith (2020); the bullMoDi prefix refers to the
'miscoding' experiment.
bull...Inferred is a list with four elements, named as in bullseyeTrees.
Each element contains 1 000 sub-elements.
Each sub-element is a list of
ten trees, which have been inferred from progressively more degraded datasets,
originally simulated from the corresponding tree in bullseyeTrees.
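The nesting described above (tree size, then generative tree, then degradation level) can be illustrated with a mock object. The sketch below performs no tree inference: each 'inferred' tree is simply a copy of its generative tree, and mockTrees and mockInferred are hypothetical names standing in for the real data objects.

```r
library(ape)

set.seed(1)
nTrees <- 3     # the documented objects hold 1 000 per tree size
nLevels <- 10   # ten progressively more degraded datasets

mockTrees <- list(
  "5 leaves" = lapply(seq_len(nTrees), function(i) rtree(5, br = NULL))
)

# Placeholder for inference: repeat the generative tree once per level.
mockInferred <- list(
  "5 leaves" = lapply(mockTrees[["5 leaves"]], function(tr) {
    replicate(nLevels, tr, simplify = FALSE)
  })
)

# Tree for the second generative tree at the tenth degradation level:
selected <- mockInferred[["5 leaves"]][[2]][[10]]
```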
bull...Scores is a list with four elements, named as in bullseyeTrees.
Each element contains a three-dimensional array, in which the first dimension
corresponds to the progressive degrees of degradation, labelled according to
the number of characters present or the percentage of tokens switched;
the second dimension is named with an abbreviation of the tree similarity /
distance metric used to score the trees (see 'Methods tested' below),
and the third dimension contains 1 000
entries corresponding to the trees in bullseyeTrees.
Each cell contains the distance between the inferred tree and the generative
tree under the stated tree distance metric.
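Indexing an array of this shape might look as follows. The sketch uses a mock object filled with random numbers, not real scores; the degradation labels, the subset of metrics, and the name mockScores are assumptions chosen for illustration.

```r
set.seed(1)
degradation <- c("2000", "1000", "500")  # hypothetical labels (characters present)
metrics <- c("cid", "rf", "path")        # a subset of the metric abbreviations
nTrees <- 5                              # the documented arrays hold 1 000

# Dimensions: degradation level x metric x generative tree
mockScores <- list(
  "5 leaves" = array(
    runif(length(degradation) * length(metrics) * nTrees),
    dim = c(length(degradation), length(metrics), nTrees),
    dimnames = list(degradation, metrics, NULL)
  )
)

scores5 <- mockScores[["5 leaves"]]

# A single cell: degradation level "500", metric "rf", third tree
oneCell <- scores5["500", "rf", 3]

# Mean score per degradation level and metric, averaged over trees
meanScores <- apply(scores5, c(1, 2), mean)
```

Averaging over the third dimension in this way gives one value per degradation level and metric, which is a convenient starting point for comparing how each metric responds to degradation.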
An object of class list of length 4.
Scripts used to generate data objects are housed in the data-raw directory.
For analysis of this data, see the accompanying vignette.
pid: Phylogenetic Information Distance (Smith 2020)
msid: Matching Split Information Distance (Smith 2020)
cid: Clustering Information Distance (Smith 2020)
qd: Quartet divergence (Smith 2019)
nye: Nye et al. tree distance (Nye et al. 2006)
jnc2, jnc4: Jaccard-Robinson-Foulds distances with k = 2, 4, conflicting pairings prohibited ('no-conflict')
jco2, jco4: Jaccard-Robinson-Foulds distances with k = 2, 4, conflicting pairings permitted ('conflict-ok')
ms: Matching Split Distance (Bogdanowicz & Giaro 2012)
mast: Size of Maximum Agreement Subtree (Valiente 2009)
masti: Information content of Maximum Agreement Subtree
nni_l, nni_t, nni_u: Lower bound, tight upper bound, and upper bound for nearest-neighbour interchange distance (Li et al. 1996)
spr: Approximate SPR distance
tbr_l, tbr_u: Lower and upper bound for tree bisection and reconnection (TBR) distance, calculated using TBRDist
rf: Robinson-Foulds distance (Robinson & Foulds 1981)
icrf: Information-corrected Robinson-Foulds distance (Smith 2020)
path: Path distance (Steel & Penny 1993)
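Several of the metrics listed above can be computed for a pair of trees with functions from the TreeDist package. The following is a sketch only, assuming that ape and TreeDist are installed; the pairing of abbreviations with functions is inferred from the metric names, and the settings used to produce the stored scores (for example, any normalization) are not specified here and may differ.

```r
library(ape)
library(TreeDist)

set.seed(1)
# Two random trees on the same eight leaf labels (t1 ... t8)
tree1 <- rtree(8, br = NULL)
tree2 <- rtree(8, br = NULL)

# Score the pair under three of the listed metrics
scores <- c(
  cid = ClusteringInfoDistance(tree1, tree2),
  rf = RobinsonFoulds(tree1, tree2),
  ms = MatchingSplitDistance(tree1, tree2)
)

# A tree is at distance zero from itself
selfDistance <- RobinsonFoulds(tree1, tree1)
```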
Kuhner MK, Yamato J (2015). “Practical performance of tree comparison metrics.” Systematic Biology, 64(2), 205–214. doi: 10.1093/sysbio/syu085.
Bogdanowicz D, Giaro K (2012). “Matching split distance for unrooted binary phylogenetic trees.” IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(1), 150–160. doi: 10.1109/TCBB.2011.48.
Li M, Tromp J, Zhang L (1996). “Some notes on the nearest neighbour interchange distance.” In Goos G, Hartmanis J, Leeuwen J, Cai J, Wong CK (eds.), Computing and Combinatorics, volume 1090, 343–351. Springer, Berlin, Heidelberg. ISBN 978-3-540-61332-9 978-3-540-68461-9, doi: 10.1007/3-540-61332-3_168.
Kendall M, Colijn C (2016). “Mapping phylogenetic trees to reveal distinct patterns of evolution.” Molecular Biology and Evolution, 33(10), 2735–2743. doi: 10.1093/molbev/msw124.
Nye TMW, Liò P, Gilks WR (2006). “A novel algorithm and web-based tool for comparing two alternative phylogenetic trees.” Bioinformatics, 22(1), 117–119. doi: 10.1093/bioinformatics/bti720.
Robinson DF, Foulds LR (1981). “Comparison of phylogenetic trees.” Mathematical Biosciences, 53(1-2), 131–147. doi: 10.1016/0025-5564(81)90043-2.
Smith MR (2019). “Bayesian and parsimony approaches reconstruct informative trees from simulated morphological datasets.” Biology Letters, 15, 20180632. doi: 10.1098/rsbl.2018.0632.
Smith MR (2020). “Information theoretic Generalized Robinson-Foulds metrics for comparing phylogenetic trees.” Bioinformatics, online ahead of print. doi: 10.1093/bioinformatics/btaa614.
Steel MA, Penny D (1993). “Distributions of tree comparison metrics---some new results.” Systematic Biology, 42(2), 126–141. doi: 10.1093/sysbio/42.2.126.
Valiente G (2009). Combinatorial Pattern Matching Algorithms in Computational Biology using Perl and R, CRC Mathematical and Computing Biology Series. CRC Press, Boca Raton.