skbio.tree.majority_rule

skbio.tree.majority_rule(trees, weights=None, cutoff=0.5, support_attr='support', tree_node_class=<class 'skbio.tree._tree.TreeNode'>)

Determines consensus trees from a list of rooted trees.

State: Experimental as of 0.4.0.

Parameters:

trees : list of TreeNode

The trees to operate on.

weights : list or np.array of {int, float}, optional

If provided, the weights must be in the same order as trees, so that weights[i] is applied to trees[i]. If omitted, all trees are weighted equally.

cutoff : float, 0.0 <= cutoff <= 1.0, optional

Any clade with support less than or equal to cutoff is dropped. If cutoff is less than 0.5, conflicting clades can tie. Such ties are broken arbitrarily, depending on list sort order.

support_attr : str, optional

The name of the attribute, set on the nodes of the resulting trees, that holds the consensus support.

tree_node_class : type, optional

Specifies the type of the consensus trees that are returned. Either TreeNode (the default) or a type that implements the same interface (most usefully, a subclass of TreeNode).

Returns:

list of tree_node_class instances

Each tree is of type tree_node_class. Multiple trees can be returned when the input represents two or more disjoint sets of tips.

Notes

This code was adapted from PyCogent’s majority consensus code, originally written by Matthew Wakefield. The method is based on the original description of consensus trees in [R283]. An additional description can be found in the Phylip manual [R284]. This method does not support majority rule extended.

Support is computed as the sum of the weights of the trees in which the clade was observed. For instance, if {A, B, C} was observed in 5 trees, all with a weight of 1, its support would be 5.
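
The support calculation described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: trees are modelled as nested tuples of tip names, and the helper names `clades` and `clade_support` are hypothetical.

```python
from collections import defaultdict


def clades(tree):
    """Return (tip set of `tree`, tip sets of every internal node in it).

    A tree is a nested tuple of tip-name strings, e.g. (("A", "B"), "C").
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    tips = frozenset()
    found = []
    for child in tree:
        child_tips, child_clades = clades(child)
        tips |= child_tips
        found.extend(child_clades)
    found.append(tips)
    return tips, found


def clade_support(trees, weights=None):
    """Sum, per clade, the weights of the trees in which that clade occurs."""
    if weights is None:
        weights = [1.0] * len(trees)  # equal weighting, as when weights is omitted
    support = defaultdict(float)
    for tree, weight in zip(trees, weights):
        _, found = clades(tree)
        for clade in found:
            support[clade] += weight
    return dict(support)


# Five equally weighted trees that all contain the clade {A, B, C}.
trees = [((("A", "B"), "C"), "D")] * 3 + [((("A", "C"), "B"), "D")] * 2
support = clade_support(trees)
print(support[frozenset("ABC")])  # 5.0
print(support[frozenset("AB")])   # 3.0
```

The clade {A, B, C} appears in all five trees and so accumulates a support of 5, while {A, B} appears in only three of them.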

References

[R283] Margush T, McMorris FR. (1981) “Consensus n-trees.” Bulletin of Mathematical Biology 43(2) 239-244.
[R284] http://evolution.genetics.washington.edu/phylip/doc/consense.html

Examples

Compute the majority consensus using the example from the Phylip manual, except that we compute majority rule rather than majority rule extended.

>>> from skbio.tree import TreeNode, majority_rule
>>> from io import StringIO
>>> trees = [
...     TreeNode.read(StringIO("(A,(B,(H,(D,(J,(((G,E),(F,I)),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(D,((J,H),(((G,E),(F,I)),C)))));")),
...     TreeNode.read(StringIO("(A,(B,(D,(H,(J,(((G,E),(F,I)),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,(G,((F,I),((J,(H,D)),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,(G,((F,I),(((J,H),D),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,((F,I),(G,((J,(H,D)),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,((F,I),(G,(((J,H),D),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,((G,(F,I)),((J,(H,D)),C)))));")),
...     TreeNode.read(StringIO("(A,(B,(E,((G,(F,I)),(((J,H),D),C)))));"))]
>>> consensus = majority_rule(trees, cutoff=0.5)[0]
>>> for node in sorted(consensus.non_tips(),
...                    key=lambda k: k.count(tips=True)):
...     support_value = node.support
...     names = ' '.join(sorted(n.name for n in node.tips()))
...     print("Tips: %s, support: %s" % (names, support_value))
Tips: F I, support: 9.0
Tips: D H J, support: 6.0
Tips: C D H J, support: 6.0
Tips: C D F G H I J, support: 6.0
Tips: C D E F G H I J, support: 9.0
Tips: B C D E F G H I J, support: 9.0

In the next example, multiple trees are returned. This can happen if clades are not well supported across the trees, or if not all tips are present in all trees.

>>> trees = [
...     TreeNode.read(StringIO("((a,b),(c,d),(e,f));")),
...     TreeNode.read(StringIO("(a,(c,d),b,(e,f));")),
...     TreeNode.read(StringIO("((c,d),(e,f),b);")),
...     TreeNode.read(StringIO("(a,(c,d),(e,f));"))]
>>> consensus_trees = majority_rule(trees)
>>> len(consensus_trees)
4
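
To see why four trees result here, the tally can be reproduced by hand. The sketch below is plain Python and not the library's code. It assumes that the cutoff is compared against the summed weights as a fraction of the total weight, and that each surviving clade with no surviving ancestor becomes its own tree. The helper name `node_tip_sets` is hypothetical, and conflicts between clades are not handled.

```python
from collections import Counter


def node_tip_sets(tree):
    """Return the tip set below every node (tips included) of a nested tuple."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    sets = []
    tips = frozenset()
    for child in tree:
        child_sets = node_tip_sets(child)
        sets.extend(child_sets)
        tips |= child_sets[-1]  # the last entry is the child's own tip set
    sets.append(tips)
    return sets


# The same four topologies as above, written as nested tuples.
trees = [
    (("a", "b"), ("c", "d"), ("e", "f")),
    ("a", ("c", "d"), "b", ("e", "f")),
    (("c", "d"), ("e", "f"), "b"),
    ("a", ("c", "d"), ("e", "f")),
]
cutoff = 0.5
threshold = cutoff * len(trees)  # every tree has a weight of 1

support = Counter()
for tree in trees:
    support.update(node_tip_sets(tree))

# Drop every clade whose support is <= the threshold.
kept = [clade for clade, count in support.items() if count > threshold]

# A kept clade that is not nested inside another kept clade roots its own tree.
roots = [c for c in kept if not any(c < other for other in kept)]
print(sorted(sorted(c) for c in roots))
# [['a'], ['b'], ['c', 'd'], ['e', 'f']]
```

The full tip set {a, b, c, d, e, f} is the root of only two of the four trees, so its support of 2 does not exceed the threshold of 2 and no single root survives to join the remaining clades.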