skbio.tree.majority_rule

skbio.tree.majority_rule(trees, weights=None, cutoff=0.5, support_attr='support', tree_node_class=<class 'skbio.tree._tree.TreeNode'>)

Determines consensus trees from a list of rooted trees.

State: Experimental as of 0.4.0.

Parameters:
  • trees (list of TreeNode) – The trees to operate on

  • weights (list or np.array of {int, float}, optional) – If provided, the weights must be in the same order as trees, so that each tree receives the weight at the same index. If omitted, all trees are equally weighted.

  • cutoff (float, 0.0 <= cutoff <= 1.0, optional) – Any clade with support <= cutoff will be dropped. If cutoff is < 0.5, ties between clades are possible. Such ties are broken arbitrarily, depending on list sort order.

  • support_attr (str, optional) – The name of the attribute, set on the nodes of the resulting trees, that holds the consensus support.

  • tree_node_class (type, optional) – Specifies type of consensus trees that are returned. Either TreeNode (the default) or a type that implements the same interface (most usefully, a subclass of TreeNode).
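The interaction between cutoff and the support values can be sketched in plain Python. This is an illustration only, not the library's code: the helper name filter_by_cutoff is hypothetical, and it assumes the cutoff is compared against the clade's support as a fraction of the total tree weight (inferred from cutoff being bounded by 1.0 while reported support values are weight totals). The real function also resolves conflicts between clades, which this sketch omits.

```python
def filter_by_cutoff(support, weights, cutoff=0.5):
    """Keep clades whose support is strictly greater than cutoff * total weight.

    Hypothetical helper: assumes the cutoff is applied to the fraction of
    the total tree weight, and that clades at exactly the cutoff are dropped.
    """
    threshold = cutoff * sum(weights)
    return {clade: s for clade, s in support.items() if s > threshold}


# Nine equally weighted trees; support values are weight totals per clade.
weights = [1.0] * 9
support = {"F I": 9.0, "D H J": 6.0, "E G": 3.0}

kept = filter_by_cutoff(support, weights, cutoff=0.5)
print(sorted(kept))

# A clade seen in exactly half of the trees is dropped, because the
# comparison is "<= cutoff", not "< cutoff".
print(filter_by_cutoff({"x": 2.0}, [1.0] * 4, cutoff=0.5))
```

With nine trees and cutoff=0.5 the threshold is 4.5, so only clades with support above 4.5 survive in this sketch.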

Returns:

Each tree will be of type tree_node_class. Multiple trees can be returned when the input represents two or more disjoint sets of tips.

Return type:

list of tree_node_class instances

Notes

This code was adapted from PyCogent’s majority consensus code, originally written by Matthew Wakefield. The method is based on the original description of consensus trees in [1]. An additional description can be found in the Phylip manual [2]. This method does not support majority rule extended.

Support is computed as the sum of the weights of the trees in which the clade was observed. For instance, if {A, B, C} was observed in 5 trees, each with a weight of 1, its support would be 5.
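The support calculation can be sketched without scikit-bio. The following is a minimal illustration, not the library's implementation: trees are modelled as nested tuples of tip names, and the helpers collect_clades and clade_support are hypothetical names introduced here. Each clade (the set of tips under an internal node) accumulates the weight of every tree in which it appears.

```python
from collections import defaultdict


def collect_clades(tree, found):
    """Record the tip set of every internal node; return this node's tip set."""
    if isinstance(tree, str):  # a tip is represented by its name
        return frozenset([tree])
    tips = frozenset().union(*(collect_clades(child, found) for child in tree))
    found.add(tips)
    return tips


def clade_support(trees, weights=None):
    """Sum, for each clade, the weights of the trees that contain it."""
    if weights is None:
        weights = [1.0] * len(trees)  # equal weighting by default
    support = defaultdict(float)
    for tree, weight in zip(trees, weights):
        found = set()
        collect_clades(tree, found)
        for clade in found:
            support[clade] += weight
    return dict(support)


trees = [
    ("A", ("B", "C"), "D"),
    ("A", ("B", "C"), "D"),
    ("A", "B", ("C", "D")),
]

support = clade_support(trees)
print(support[frozenset({"B", "C"})])   # observed in two trees of weight 1

weighted = clade_support(trees, weights=[1.0, 1.0, 3.0])
print(weighted[frozenset({"C", "D"})])  # observed once, in the tree of weight 3
```

In this sketch the clade {B, C} gets a support of 2.0 under equal weights, while giving the third tree a weight of 3 raises the support of {C, D} to 3.0.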

References

Examples

Compute the majority consensus using the example from the Phylip manual, except that we compute majority rule rather than majority rule extended.

>>> from skbio.tree import TreeNode, majority_rule
>>> from io import StringIO
>>> trees = [
...     TreeNode.read(StringIO("(A,(B,(H,(D,(J,(((G,E),(F,I)),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(D,((J,H),(((G,E),(F,I)),C)))));")),
...     TreeNode.read(StringIO("(A,(B,(D,(H,(J,(((G,E),(F,I)),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,(G,((F,I),((J,(H,D)),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,(G,((F,I),(((J,H),D),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,((F,I),(G,((J,(H,D)),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,((F,I),(G,(((J,H),D),C))))));")),
...     TreeNode.read(StringIO("(A,(B,(E,((G,(F,I)),((J,(H,D)),C)))));")),
...     TreeNode.read(StringIO("(A,(B,(E,((G,(F,I)),(((J,H),D),C)))));"))]
>>> consensus = majority_rule(trees, cutoff=0.5)[0]
>>> for node in sorted(consensus.non_tips(),
...                    key=lambda k: k.count(tips=True)):
...     support_value = node.support
...     names = ' '.join(sorted(n.name for n in node.tips()))
...     print("Tips: %s, support: %s" % (names, support_value))
Tips: F I, support: 9.0
Tips: D H J, support: 6.0
Tips: C D H J, support: 6.0
Tips: C D F G H I J, support: 6.0
Tips: C D E F G H I J, support: 9.0
Tips: B C D E F G H I J, support: 9.0

In the next example, multiple trees are returned. This can happen when clades are not well supported across the trees, or when not all tips are present in every tree.

>>> trees = [
...     TreeNode.read(StringIO("((a,b),(c,d),(e,f));")),
...     TreeNode.read(StringIO("(a,(c,d),b,(e,f));")),
...     TreeNode.read(StringIO("((c,d),(e,f),b);")),
...     TreeNode.read(StringIO("(a,(c,d),(e,f));"))]
>>> consensus_trees = majority_rule(trees)
>>> len(consensus_trees)
4