DNA.frequencies(chars=None, relative=False)
Compute frequencies of characters in the sequence.
State: Experimental as of 0.4.1.
Parameters:
chars : str or set of str, optional
relative : bool, optional
Returns:
dict
Raises:
TypeError
ValueError
ValueError
See also
Notes
If the sequence is empty (i.e., length zero), relative=True, and chars is provided, the relative frequency of each specified character will be np.nan.
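The empty-sequence case can be illustrated with a minimal pure-Python sketch. The helper `relative_frequencies` is hypothetical and is not scikit-bio's implementation; it only mirrors the behavior described in the note, and it uses `math.nan` as a stand-in for `np.nan`.

```python
import math


def relative_frequencies(sequence, chars):
    """Hypothetical stand-in for frequencies(chars=..., relative=True)."""
    length = len(sequence)
    freqs = {}
    for char in chars:
        count = sequence.count(char)
        # For a zero-length sequence the ratio 0/0 is undefined, so report NaN.
        freqs[char] = count / length if length > 0 else math.nan
    return freqs


# Empty sequence: every requested character maps to NaN.
empty = relative_frequencies('', {'A', 'C'})

# Non-empty sequence: ordinary relative frequencies.
nonempty = relative_frequencies('AGAAGACC', {'A', 'C', 'T'})
```

Because NaN never compares equal to itself, check such results with `math.isnan` (or `np.isnan`) instead of `==`.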
If chars is not provided, this method is equivalent to, but faster than, seq.kmer_frequencies(k=1).
If chars is not provided, it is equivalent to, but faster than, passing chars=seq.observed_chars.
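The two equivalences in these notes can be sketched in plain Python, without scikit-bio. This is an illustration of why the three results coincide, not the library's code; the variable names are chosen for this example only.

```python
from collections import Counter

sequence = 'AGAAGACC'

# Default behavior: count every character that occurs in the sequence.
all_counts = dict(Counter(sequence))

# Counting overlapping k-mers with k=1 produces the same mapping,
# because each 1-mer is a single character.
k = 1
kmer_counts = dict(
    Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
)

# Restricting the count to the set of observed characters changes nothing,
# because that set already contains every character present.
observed = set(sequence)
observed_counts = {char: sequence.count(char) for char in observed}
```

All three dictionaries hold the same counts. The notes state only that the default call is faster than the other two routes.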
Examples
Compute character frequencies of a sequence:
>>> from pprint import pprint
>>> from skbio import Sequence
>>> seq = Sequence('AGAAGACC')
>>> freqs = seq.frequencies()
>>> pprint(freqs) # using pprint to display dict in sorted order
{'A': 4, 'C': 2, 'G': 2}
Compute relative character frequencies:
>>> freqs = seq.frequencies(relative=True)
>>> pprint(freqs)
{'A': 0.5, 'C': 0.25, 'G': 0.25}
Compute relative frequencies of characters A, C, and T:
>>> freqs = seq.frequencies(chars={'A', 'C', 'T'}, relative=True)
>>> pprint(freqs)
{'A': 0.5, 'C': 0.25, 'T': 0.0}
Note that since character T is not in the sequence we receive a relative frequency of 0.0. The relative frequencies of A and C are relative to the number of characters in the sequence (8), not the number of A and C characters (4 + 2 = 6).
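The difference between the two possible denominators can be made explicit with a small plain-Python sketch. It does not call scikit-bio; `relative_to_selected` shows an alternative normalization for contrast only and is not something `frequencies` returns.

```python
sequence = 'AGAAGACC'
chars = ('A', 'C', 'T')
counts = {char: sequence.count(char) for char in chars}

# Documented behavior: divide by the full sequence length (8).
relative_to_sequence = {
    char: n / len(sequence) for char, n in counts.items()
}

# For contrast only: divide by the total of the selected counts (4 + 2 + 0 = 6).
selected_total = sum(counts.values())
relative_to_selected = {
    char: n / selected_total for char, n in counts.items()
}

# The documented frequencies need not sum to 1 when chars omits
# characters that occur in the sequence (here, G).
total_documented = sum(relative_to_sequence.values())
```

Here `total_documented` is 0.75, since the two G characters count toward the sequence length but are not among the requested characters.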