skbio.tree.majority_rule(trees, weights=None, cutoff=0.5, support_attr='support')
Determines consensus trees from a list of rooted trees.
State: Experimental as of 0.4.0.
Parameters
trees : list of TreeNode
weights : list or np.array of {int, float}, optional
cutoff : float, 0.0 <= cutoff <= 1.0
support_attr : str
Returns
list of TreeNode
Notes
This code was adapted from PyCogent's majority consensus code, originally written by Matthew Wakefield. The method is based on the original description of consensus trees in [R279]. An additional description can be found in the Phylip manual [R280]. This method does not support majority rule extended.
Support is computed as the sum of the weights of the trees in which the clade was observed. For instance, if {A, B, C} was observed in 5 trees, each with a weight of 1, its support would be 5.
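The support calculation described above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: trees are represented as nested tuples of tip names rather than TreeNode objects, and the helper names `collect_clades`, `clade_support`, and `supported_clades` are hypothetical. The strict "greater than" comparison against `cutoff` times the total weight is also an assumption; the library's exact handling of ties is not stated here.

```python
from collections import defaultdict


def collect_clades(tree, found):
    """Record the tip set under every internal node; return this node's tip set."""
    if isinstance(tree, str):  # a tip is represented by its name
        return frozenset([tree])
    tips = frozenset().union(*(collect_clades(child, found) for child in tree))
    found.add(tips)
    return tips


def clade_support(trees, weights=None):
    """Sum, for each clade, the weights of the trees that contain it."""
    if weights is None:
        weights = [1.0] * len(trees)  # assumed default: every tree counts once
    support = defaultdict(float)
    for tree, weight in zip(trees, weights):
        found = set()
        collect_clades(tree, found)
        for clade in found:
            support[clade] += weight
    return dict(support)


def supported_clades(trees, weights=None, cutoff=0.5):
    """Keep clades whose support exceeds cutoff * total weight (assumed rule)."""
    if weights is None:
        weights = [1.0] * len(trees)
    threshold = cutoff * sum(weights)
    return {c: s for c, s in clade_support(trees, weights).items() if s > threshold}


# Five identical trees, each with weight 1: {A, B, C} gets support 5.
five = [(("A", "B", "C"), "D")] * 5
print(clade_support(five)[frozenset("ABC")])  # 5.0
```

Passing unequal weights changes which clades clear the threshold, since a single heavily weighted tree can outvote several lightly weighted ones.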
References
[R279] Margush T, McMorris FR. (1981) “Consensus n-trees.” Bulletin of Mathematical Biology 43(2): 239-244.
[R280] http://evolution.genetics.washington.edu/phylip/doc/consense.html
Examples
Compute the majority consensus using the example from the Phylip manual, except that we compute majority rule rather than majority rule extended.
>>> from skbio.tree import TreeNode, majority_rule
>>> from io import StringIO
>>> trees = [
... TreeNode.read(StringIO("(A,(B,(H,(D,(J,(((G,E),(F,I)),C))))));")),
... TreeNode.read(StringIO("(A,(B,(D,((J,H),(((G,E),(F,I)),C)))));")),
... TreeNode.read(StringIO("(A,(B,(D,(H,(J,(((G,E),(F,I)),C))))));")),
... TreeNode.read(StringIO("(A,(B,(E,(G,((F,I),((J,(H,D)),C))))));")),
... TreeNode.read(StringIO("(A,(B,(E,(G,((F,I),(((J,H),D),C))))));")),
... TreeNode.read(StringIO("(A,(B,(E,((F,I),(G,((J,(H,D)),C))))));")),
... TreeNode.read(StringIO("(A,(B,(E,((F,I),(G,(((J,H),D),C))))));")),
... TreeNode.read(StringIO("(A,(B,(E,((G,(F,I)),((J,(H,D)),C)))));")),
... TreeNode.read(StringIO("(A,(B,(E,((G,(F,I)),(((J,H),D),C)))));"))]
>>> consensus = majority_rule(trees, cutoff=0.5)[0]
>>> for node in sorted(consensus.non_tips(),
...                    key=lambda k: k.count(tips=True)):
...     support_value = node.support
...     names = ' '.join(sorted(n.name for n in node.tips()))
...     print("Tips: %s, support: %s" % (names, support_value))
Tips: F I, support: 9.0
Tips: D H J, support: 6.0
Tips: C D H J, support: 6.0
Tips: C D F G H I J, support: 6.0
Tips: C D E F G H I J, support: 9.0
Tips: B C D E F G H I J, support: 9.0
In the next example, multiple trees are returned. This can happen if clades are not well supported across the trees, or if not all tips are present in every tree.
>>> trees = [
... TreeNode.read(StringIO("((a,b),(c,d),(e,f));")),
... TreeNode.read(StringIO("(a,(c,d),b,(e,f));")),
... TreeNode.read(StringIO("((c,d),(e,f),b);")),
... TreeNode.read(StringIO("(a,(c,d),(e,f));"))]
>>> consensus_trees = majority_rule(trees)
>>> len(consensus_trees)
4
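To see whether differing tip sets may be contributing to a multi-tree result, one option is to compare the tips of each input tree against the union of all tips. The sketch below does this for the four trees above, written as nested tuples so that it runs without scikit-bio; `tip_names` is a hypothetical helper. With TreeNode objects, the same set could presumably be built from `tree.tips()` and each tip's `name`, as in the first example.

```python
def tip_names(tree):
    """Return the set of tip names in a nested-tuple tree."""
    if isinstance(tree, str):  # a tip is represented by its name
        return {tree}
    return set().union(*(tip_names(child) for child in tree))


# The four trees from the example above, written as nested tuples.
trees = [
    (("a", "b"), ("c", "d"), ("e", "f")),
    ("a", ("c", "d"), "b", ("e", "f")),
    (("c", "d"), ("e", "f"), "b"),
    ("a", ("c", "d"), ("e", "f")),
]

all_tips = set().union(*(tip_names(t) for t in trees))
# For each tree, list the tips that appear elsewhere but not in this tree.
missing = [sorted(all_tips - tip_names(t)) for t in trees]
print(missing)  # [[], [], ['a'], ['b']]
```

Here the third tree lacks `a` and the fourth lacks `b`, so no clade containing all six tips can appear in more than two of the four trees.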