skbio.tree.nj

skbio.tree.nj(dm, disallow_negative_branch_length=True, result_constructor=None)

Apply neighbor joining for phylogenetic reconstruction.

State: Experimental as of 0.4.0.

Parameters:

dm : skbio.DistanceMatrix

Input distance matrix containing distances between OTUs.

disallow_negative_branch_length : bool, optional

Neighbor joining can result in negative branch lengths, which don’t make sense in an evolutionary context. If True, negative branch lengths will be returned as zero, a common strategy for handling this issue that was proposed by the original developers of the algorithm.

result_constructor : function, optional

Function used to construct the result object. It must accept a newick-formatted string as input, and whatever it returns is returned from nj. Defaults to lambda x: TreeNode.read(StringIO(x), format='newick').
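The effect of disallow_negative_branch_length can be sketched in plain Python. The limb-length formula below is the standard neighbor-joining one; the function name limb_lengths, its arguments, and the three-taxon distances are hypothetical and chosen only for illustration. Clamping both limbs after they are computed is an assumption; the text above does not say at which point scikit-bio applies the zeroing.

```python
def limb_lengths(d_ij, total_i, total_j, n, disallow_negative=True):
    """Branch lengths from taxa i and j to their new parent node u.

    d_ij    : distance between i and j
    total_i : sum of distances from i to every taxon in the current matrix
    total_j : the same sum for j
    n       : number of taxa in the current matrix (must be > 2)
    """
    i_to_u = 0.5 * d_ij + (total_i - total_j) / (2 * (n - 2))
    j_to_u = d_ij - i_to_u
    if disallow_negative:
        # Assumption: "returned as zero" is modeled as clamping each limb at 0.
        i_to_u = max(i_to_u, 0.0)
        j_to_u = max(j_to_u, 0.0)
    return i_to_u, j_to_u


# Hypothetical distances that violate the triangle inequality:
# d(a, b) = 1, d(a, c) = 1, d(b, c) = 5.
raw = limb_lengths(1.0, 1.0 + 1.0, 1.0 + 5.0, n=3, disallow_negative=False)
clamped = limb_lengths(1.0, 1.0 + 1.0, 1.0 + 5.0, n=3)

print(raw)      # (-1.5, 2.5)
print(clamped)  # (0.0, 2.5)
```

Distances that violate the triangle inequality, as here, are one way a negative limb can arise.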

Returns:

TreeNode

By default, the result object is a TreeNode, though this can be overridden by passing result_constructor.

Notes

Neighbor joining was initially described in Saitou and Nei (1987) [R281]. The example presented here is derived from the Wikipedia page on neighbor joining [R282]. The Phylip manual also describes the method [R283] and Phylip itself provides an implementation which is useful for comparison.
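The first join of the algorithm can be worked through in plain Python using the Q-matrix formulation given on the Wikipedia page [R282] and the same five-OTU distances used in the Examples section. This is an illustrative sketch under those assumptions, not the scikit-bio implementation; all variable names are hypothetical.

```python
from itertools import combinations

ids = list("abcde")
data = [[0, 5, 9, 9, 8],
        [5, 0, 10, 10, 9],
        [9, 10, 0, 8, 7],
        [9, 10, 8, 0, 3],
        [8, 9, 7, 3, 0]]

n = len(ids)
totals = [sum(row) for row in data]  # row sums of the distance matrix

# Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
q = {(ids[i], ids[j]): (n - 2) * data[i][j] - totals[i] - totals[j]
     for i, j in combinations(range(n), 2)}

# The pair with the smallest Q value is joined first.
pair = min(q, key=q.get)
print(pair, q[pair])  # ('a', 'b') -50

# Branch lengths from each member of the pair to the new internal node.
i, j = ids.index(pair[0]), ids.index(pair[1])
first_len = 0.5 * data[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
second_len = data[i][j] - first_len
print(first_len, second_len)  # 2.0 3.0
```

The two lengths agree with the a:2.000000 and b:3.000000 entries in the newick output shown in the Examples section.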

Neighbor joining, by definition, creates unrooted trees. One strategy for rooting the resulting trees is midpoint rooting, which is accessible as TreeNode.root_at_midpoint.

References

[R281] Saitou N, and Nei M. (1987) “The neighbor-joining method: a new method for reconstructing phylogenetic trees.” Molecular Biology and Evolution. PMID: 3447015.
[R282] http://en.wikipedia.org/wiki/Neighbour_joining
[R283] http://evolution.genetics.washington.edu/phylip/doc/neighbor.html

Examples

Define a new distance matrix object describing the distances between five OTUs: a, b, c, d, and e.

>>> from skbio import DistanceMatrix
>>> from skbio.tree import nj
>>> data = [[0,  5,  9,  9,  8],
...         [5,  0, 10, 10,  9],
...         [9, 10,  0,  8,  7],
...         [9, 10,  8,  0,  3],
...         [8,  9,  7,  3,  0]]
>>> ids = list('abcde')
>>> dm = DistanceMatrix(data, ids)

Construct the neighbor joining tree representing the relationship between those OTUs. This is returned as a TreeNode object.

>>> tree = nj(dm)
>>> print(tree.ascii_art())
          /-d
         |
         |          /-c
         |---------|
---------|         |          /-b
         |          \--------|
         |                    \-a
         |
          \-e

Construct the neighbor joining tree again, but this time return the newick string representing the tree rather than the TreeNode object. (Note that in this example the string output is truncated when printed to facilitate rendering.)

>>> newick_str = nj(dm, result_constructor=str)
>>> print(newick_str[:55], "...")
(d:2.000000, (c:4.000000, (b:3.000000, a:2.000000):3.00 ...
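Because result_constructor only needs to accept a newick-formatted string, any callable with that shape could in principle be supplied. The sketch below defines a hypothetical helper, tip_names, that extracts tip labels with a regular expression. It assumes unquoted labels written as name:length, and it is exercised on a made-up newick string so that it runs without scikit-bio installed.

```python
import re


def tip_names(newick):
    """Return tip labels from a newick string whose tips look like name:length.

    Assumes unquoted labels; labels on internal nodes (which follow a
    closing parenthesis) are not matched.
    """
    return re.findall(r"[(,]\s*([^\s():,;]+)\s*:", newick)


# Hypothetical newick string, not output from nj.
example = "((a:1.0, b:2.0):0.5, c:3.0);"
print(tip_names(example))  # ['a', 'b', 'c']

# With scikit-bio available, the same callable could be passed as:
#     names = nj(dm, result_constructor=tip_names)
```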