skbio.alignment.Alignment.k_word_frequencies

Alignment.k_word_frequencies(k, overlapping=True)

Return the frequencies of all words of length k in each sequence of the Alignment.

Parameters:

k : int

The word length.

overlapping : bool, optional

Whether the k-words should overlap. This is only relevant when k > 1.

Returns:

list

List of collections.defaultdict objects, one per sequence in the Alignment, each mapping a k-word to its relative frequency in that sequence.
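
To make the effect of the overlapping parameter concrete, the following is a minimal pure-Python sketch of a k-word frequency calculation for a single string. The function k_word_frequencies_sketch is a hypothetical helper written for illustration and is not part of scikit-bio. It assumes that non-overlapping words are taken by stepping through the sequence k positions at a time and that any trailing characters too short to form a full word are ignored; the library's actual behavior may differ in these details.

```python
from collections import Counter


def k_word_frequencies_sketch(seq, k, overlapping=True):
    """Hypothetical helper (not scikit-bio API): relative k-word frequencies.

    Assumes non-overlapping words advance k positions at a time and that a
    trailing fragment shorter than k is dropped.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Overlapping words start at every position; non-overlapping every k-th.
    step = 1 if overlapping else k
    words = [seq[i:i + k] for i in range(0, len(seq) - k + 1, step)]
    counts = Counter(words)
    total = len(words)
    # When the sequence is shorter than k, counts is empty and so is the result.
    return {word: count / total for word, count in counts.items()}


print(k_word_frequencies_sketch("ATAT", 2))                     # words: AT, TA, AT
print(k_word_frequencies_sketch("ATAT", 2, overlapping=False))  # words: AT, AT
print(k_word_frequencies_sketch("A", 2))                        # no words of length 2
```

Under these assumptions, the overlapping call on "ATAT" sees three words (AT, TA, AT), while the non-overlapping call sees only two (AT, AT), so "TA" disappears from the result. A sequence shorter than k yields an empty mapping, matching the empty defaultdict shown for the single-base sequence in the examples.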

Examples

>>> from skbio.alignment import SequenceCollection
>>> from skbio.sequence import DNA
>>> sequences = [DNA('A', id="seq1"),
...              DNA('AT', id="seq2"),
...              DNA('TTTT', id="seq3")]
>>> s1 = SequenceCollection(sequences)
>>> for freqs in s1.k_word_frequencies(1):
...     print(freqs)
defaultdict(<type 'int'>, {'A': 1.0})
defaultdict(<type 'int'>, {'A': 0.5, 'T': 0.5})
defaultdict(<type 'int'>, {'T': 1.0})
>>> for freqs in s1.k_word_frequencies(2):
...     print(freqs)
defaultdict(<type 'int'>, {})
defaultdict(<type 'int'>, {'AT': 1.0})
defaultdict(<type 'int'>, {'TT': 1.0})