skbio.core.distance.DissimilarityMatrix

class skbio.core.distance.DissimilarityMatrix(data, ids=None)

Store dissimilarities between objects.

A DissimilarityMatrix instance stores a square, hollow, two-dimensional matrix of dissimilarities between objects. Objects could be, for example, samples or DNA sequences. A sequence of IDs accompanies the dissimilarities.

Methods are provided to load dissimilarity matrices from disk and save them to disk, and to perform common operations such as extracting dissimilarities by object ID.

Parameters:

data : array_like or DissimilarityMatrix

Square, hollow, two-dimensional numpy.ndarray of dissimilarities (floats), or a structure that can be converted to a numpy.ndarray using numpy.asarray. Can instead be a DissimilarityMatrix (or subclass) instance, in which case the instance’s data will be used. Data will be converted to a float dtype if necessary. No copy is made if data is already a numpy.ndarray with a float dtype.

ids : sequence of str, optional

Sequence of strings to be used as object IDs. The number of IDs must match the number of rows/columns in data. If None (the default), IDs will be monotonically increasing integers cast as strings, with numbering starting from zero, e.g., ('0', '1', '2', '3', ...).
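
The conversion and default-ID rules described for data and ids can be illustrated with plain NumPy. This is a minimal sketch under stated assumptions, not scikit-bio's implementation: the helper name prepare is hypothetical, and the checks it performs are inferred from the parameter descriptions only.

```python
import numpy as np


def prepare(data, ids=None):
    """Hypothetical helper mirroring the documented rules; not skbio code."""
    # numpy.asarray avoids a copy when data is already a float ndarray.
    arr = np.asarray(data, dtype=float)
    if arr.ndim != 2 or arr.shape[0] != arr.shape[1]:
        raise ValueError("data must be a square, two-dimensional array")
    if np.any(np.diag(arr) != 0):
        raise ValueError("data must be hollow (zeros on the diagonal)")
    if ids is None:
        # Default IDs: '0', '1', '2', ... as strings.
        ids = tuple(str(i) for i in range(arr.shape[0]))
    else:
        ids = tuple(ids)
        if len(ids) != arr.shape[0]:
            raise ValueError("number of IDs must match the matrix dimensions")
    return arr, ids


# A float ndarray is used as-is, and default IDs are generated.
raw = np.array([[0.0, 0.5], [0.25, 0.0]])
arr, ids = prepare(raw)

# Integer input is converted to a float dtype; explicit IDs are kept.
int_arr, named_ids = prepare([[0, 1], [2, 0]], ids=["a", "b"])
```

Because the first input is not copied in this sketch, later in-place changes to raw would also be visible through arr. Pass a copy if that sharing is not wanted.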

See also

DistanceMatrix

Notes

The dissimilarities are stored in redundant (square-form) format [R49].

The data are not checked for symmetry and are neither guaranteed nor assumed to be symmetric.
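
The two notes above are related: redundant storage keeps both the (i, j) and (j, i) entries, so an asymmetric matrix can be represented without loss. The sketch below uses plain NumPy arrays rather than a DissimilarityMatrix, and the example values are made up for illustration.

```python
import numpy as np

# Hollow but asymmetric: the entry at (0, 1) is 1.0 while (1, 0) is 4.0.
data = np.array([[0.0, 1.0, 2.0],
                 [4.0, 0.0, 3.0],
                 [5.0, 6.0, 0.0]])

is_hollow = bool(np.all(np.diag(data) == 0))
is_symmetric = bool(np.array_equal(data, data.T))

# A condensed vector keeps only the upper triangle, so for asymmetric
# data the lower-triangle values (4.0, 5.0, 6.0) would be lost.
upper = data[np.triu_indices(3, k=1)]
```

Here is_hollow is True and is_symmetric is False, which is a valid combination for the data described in this class.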

References

[R49] http://docs.scipy.org/doc/scipy/reference/spatial.distance.html

Attributes

data   Array of dissimilarities.
ids    Tuple of object IDs.
dtype  Data type of the dissimilarities.
shape  Two-element tuple containing the dissimilarity matrix dimensions.
size   Total number of elements in the dissimilarity matrix.
T      Transpose of the dissimilarity matrix.

Methods

__contains__(lookup_id)       Check if the specified ID is in the dissimilarity matrix.
__eq__(other)                 Compare this dissimilarity matrix to another for equality.
__getitem__(index)            Slice into dissimilarity data by object ID or numpy indexing.
__ne__(other)                 Determine whether two dissimilarity matrices are not equal.
__str__()                     Return a string representation of the dissimilarity matrix.
copy()                        Return a deep copy of the dissimilarity matrix.
filter(ids)                   Filter the dissimilarity matrix by IDs.
from_file(dm_f[, delimiter])  Load dissimilarity matrix from a delimited text file or file path.
index(lookup_id)              Return the index of the specified ID.
redundant_form()              Return an array of dissimilarities in redundant format.
to_file(out_f[, delimiter])   Save the dissimilarity matrix to file in delimited text format.
transpose()                   Return the transpose of the dissimilarity matrix.
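
The ID-based operations listed above (membership, index lookup, slicing by ID, and filtering) can be pictured as a mapping from IDs to row/column positions. The following is a rough sketch in plain NumPy, not the library's implementation: the position dictionary is a hypothetical stand-in, and the ordering of the filtered result is an assumption made for this example.

```python
import numpy as np

data = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 3.0],
                 [2.0, 3.0, 0.0]])
ids = ("a", "b", "c")

# Map each ID to its row/column position (the role played by index()).
position = {id_: i for i, id_ in enumerate(ids)}

# Membership test, analogous to __contains__.
has_b = "b" in position

# All dissimilarities for one object, looked up by ID.
row_b = data[position["b"]]

# A single dissimilarity between two objects.
d_ac = data[position["a"], position["c"]]

# Filtering to a subset of IDs: select the same rows and columns.
# Following the requested order ("c", "a") is an assumption of this sketch.
keep = [position[i] for i in ("c", "a")]
filtered = data[np.ix_(keep, keep)]
```

Selecting identical row and column positions keeps the result square and hollow, which is what a filtered dissimilarity matrix would need to remain valid.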