skbio.stats.distance.pwmantel

skbio.stats.distance.pwmantel(dms, labels=None, method='pearson', permutations=999, alternative='two-sided', strict=True, lookup=None)

Run Mantel tests for every pair of given distance matrices.

State: Experimental as of 0.4.0.

Runs a Mantel test for each pair of distance matrices and collates the results in a DataFrame. Distance matrices do not need to be in the same ID order if they are DistanceMatrix instances. Distance matrices will be re-ordered prior to running each pairwise test, and if strict=False, IDs that don’t match between a pair of distance matrices will be dropped prior to running the test (otherwise a ValueError will be raised if there are nonmatching IDs between any pair of distance matrices).
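The reordering and ID matching described above can be pictured with a small pure-Python sketch. It is only an illustration under stated assumptions, not scikit-bio's implementation: the helpers `match_ids` and `reorder` are hypothetical, and plain lists of rows plus separate ID lists stand in for DistanceMatrix objects.

```python
def match_ids(ids_a, ids_b, strict=True):
    """Return the IDs shared by both matrices, in the order of ids_a.

    Hypothetical helper that mimics the documented strict/non-strict
    behaviour: nonmatching IDs raise ValueError when strict is True and
    are silently dropped otherwise.
    """
    in_b = set(ids_b)
    nonmatching = set(ids_a) ^ in_b
    if strict and nonmatching:
        raise ValueError(f"nonmatching IDs: {sorted(nonmatching)}")
    return [id_ for id_ in ids_a if id_ in in_b]


def reorder(matrix, ids, order):
    """Subset and reorder a square matrix (list of rows) to follow `order`."""
    index = {id_: k for k, id_ in enumerate(ids)}
    return [[matrix[index[r]][index[c]] for c in order] for r in order]


ids_a = ["s1", "s2", "s3"]
a = [[0, 1, 2],
     [1, 0, 3],
     [2, 3, 0]]

ids_b = ["s3", "s1", "s4"]  # different order; "s2" and "s4" do not match
b = [[0, 7, 6],
     [7, 0, 2],
     [6, 2, 0]]

order = match_ids(ids_a, ids_b, strict=False)  # ["s1", "s3"]
a_matched = reorder(a, ids_a, order)           # [[0, 2], [2, 0]]
b_matched = reorder(b, ids_b, order)           # [[0, 7], [7, 0]]
```

With strict=True the same call to the hypothetical `match_ids` raises ValueError, because "s2" and "s4" each appear in only one of the two ID lists.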

Parameters:

dms : iterable of DistanceMatrix objects, array_like objects, or filepaths to distance matrices

If they are array_like, no reordering or matching of IDs will be performed.

labels : iterable of str or int, optional

Labels for each distance matrix in dms. These are used in the results DataFrame to identify the pair of distance matrices used in a pairwise Mantel test. If None, defaults to monotonically-increasing integers starting at zero.

method : {‘pearson’, ‘spearman’}, optional

Correlation method. See mantel function for more details.

permutations : int, optional

Number of permutations. See mantel function for more details.

alternative : {‘two-sided’, ‘greater’, ‘less’}, optional

Alternative hypothesis. See mantel function for more details.

strict : bool, optional

Handling of nonmatching IDs. See mantel function for more details.

lookup : dict, optional

Map existing IDs to new IDs. See mantel function for more details.

Returns:

pandas.DataFrame

DataFrame containing the results of each pairwise test (one per row). Includes the number of objects considered in each test as column n (after applying lookup and filtering nonmatching IDs if strict=False). Column p-value will display p-values as NaN if p-values could not be computed (they are stored as np.nan within the DataFrame; see mantel for more details).
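To make the "one row per pairwise test" layout concrete, the following sketch enumerates the pairs that would label the rows. It assumes the pairs are the unordered combinations of the inputs, taken in input order; the helper `pair_labels` is hypothetical and is not part of scikit-bio.

```python
from itertools import combinations


def pair_labels(n_matrices, labels=None):
    """Return the (dm1, dm2) label pairs for n_matrices distance matrices.

    Hypothetical helper: when labels is None, fall back to integers
    starting at zero, as the labels parameter documents.
    """
    if labels is None:
        labels = range(n_matrices)
    labels = list(labels)
    if len(labels) != n_matrices:
        raise ValueError("need exactly one label per distance matrix")
    return list(combinations(labels, 2))


pairs_default = pair_labels(3)
# [(0, 1), (0, 2), (1, 2)]

pairs_named = pair_labels(3, labels=("x", "y", "z"))
# [("x", "y"), ("x", "z"), ("y", "z")]
```

Under this assumption, k distance matrices give k * (k - 1) / 2 rows, so the number of tests grows quadratically with the number of inputs.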

Notes

Passing a list of filepaths can reduce memory consumption: only two distance matrices are loaded at a time, rather than all of them being held in memory at once.
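The memory trade-off can be sketched in a few lines of plain Python. This is an illustration only, not the library's loading code: `FAKE_DISK`, `load`, and `iter_pairs_lazily` are hypothetical names, and a dict stands in for files on disk so the example runs without any I/O.

```python
from itertools import combinations

# Hypothetical stand-in for files on disk: path -> matrix contents.
FAKE_DISK = {
    "a.tsv": [[0, 1], [1, 0]],
    "b.tsv": [[0, 2], [2, 0]],
    "c.tsv": [[0, 3], [3, 0]],
}

load_log = []  # records every (simulated) read


def load(path):
    """Simulate reading one distance matrix from disk."""
    load_log.append(path)
    return FAKE_DISK[path]


def iter_pairs_lazily(paths):
    """Yield one pair of loaded matrices at a time.

    Only the two matrices of the current pair are referenced here, so
    earlier ones can be garbage-collected before the next pair is read.
    """
    for p1, p2 in combinations(paths, 2):
        yield (p1, load(p1)), (p2, load(p2))


pairs_seen = []
for (name1, dm1), (name2, dm2) in iter_pairs_lazily(["a.tsv", "b.tsv", "c.tsv"]):
    # A Mantel test on dm1 and dm2 would run here.
    pairs_seen.append((name1, name2))
```

In this sketch each file is read once per pair it takes part in (twice each for three files), so the lower peak memory is paid for with repeated reads.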

Examples

Import the functionality we’ll use in the following examples:

>>> from skbio import DistanceMatrix
>>> from skbio.stats.distance import pwmantel

Define three 3x3 distance matrices:

>>> x = DistanceMatrix([[0, 1, 2],
...                     [1, 0, 3],
...                     [2, 3, 0]])
>>> y = DistanceMatrix([[0, 2, 7],
...                     [2, 0, 6],
...                     [7, 6, 0]])
>>> z = DistanceMatrix([[0, 5, 6],
...                     [5, 0, 1],
...                     [6, 1, 0]])

Run Mantel tests for each pair of distance matrices (there are 3 possible pairs):

>>> pwmantel((x, y, z), labels=('x', 'y', 'z'),
...          permutations=0)
             statistic p-value  n   method  permutations alternative
dm1 dm2
x   y     0.755929     NaN  3  pearson             0   two-sided
    z    -0.755929     NaN  3  pearson             0   two-sided
y   z    -0.142857     NaN  3  pearson             0   two-sided

Note that we passed permutations=0 to suppress significance tests, so the p-values in the output are reported as NaN.
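As a cross-check, the values in the statistic column can be reproduced without scikit-bio. With method='pearson', the Mantel statistic is the Pearson correlation between the upper-triangle entries of the two matrices. The sketch below uses only the standard library; the helpers `condensed` and `pearson` are hypothetical names introduced for illustration.

```python
from itertools import combinations
from math import sqrt


def condensed(matrix):
    """Return the entries above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    numerator = sum(a * b for a, b in zip(du, dv))
    denominator = sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return numerator / denominator


# The same three matrices as in the example above.
matrices = {
    "x": [[0, 1, 2], [1, 0, 3], [2, 3, 0]],
    "y": [[0, 2, 7], [2, 0, 6], [7, 6, 0]],
    "z": [[0, 5, 6], [5, 0, 1], [6, 1, 0]],
}

stats = {
    (a, b): pearson(condensed(matrices[a]), condensed(matrices[b]))
    for a, b in combinations(matrices, 2)
}

for pair, value in stats.items():
    print(pair, round(value, 6))
# ('x', 'y') 0.755929
# ('x', 'z') -0.755929
# ('y', 'z') -0.142857
```

Each 3x3 matrix contributes only three distinct off-diagonal distances, so these correlations rest on three data points per pair.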