skbio.alignment.TabularMSA.gap_frequencies
TabularMSA.gap_frequencies(axis='sequence', relative=False)
Compute frequency of gap characters across an axis.
State: Experimental as of 0.4.1.
- Parameters
axis ({'sequence', 'position'}, optional) – Axis to compute gap character frequencies across. If ‘sequence’ or 0, frequencies are computed for each position in the MSA. If ‘position’ or 1, frequencies are computed for each sequence.
relative (bool, optional) – If True, return the relative frequency of gap characters instead of the count.
- Returns
Vector of gap character frequencies across the specified axis. Will have int dtype if relative=False and float dtype if relative=True.
- Return type
1D np.ndarray (int or float)
- Raises
ValueError – If axis is invalid.
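The axis semantics described above can be easy to misread, because the named axis is the one that is collapsed rather than the one that is kept. The following is a minimal pure-Python sketch of that behavior, not scikit-bio's implementation. The function name gap_frequencies_sketch and the constant GAP_CHARS are hypothetical, the input is assumed to be a list of equal-length strings, and the gap characters are assumed to be '-' and '.', as in the DNA examples in this section.

```python
# Illustrative sketch only; this is NOT scikit-bio's implementation.
GAP_CHARS = frozenset("-.")  # assumption: gap characters as used in the DNA examples


def gap_frequencies_sketch(rows, axis="sequence", relative=False):
    """Count gap characters in a list of equal-length strings along an axis."""
    if axis in ("sequence", 0):
        # Collapse the sequence axis: one value per position (column).
        groups = list(zip(*rows))
        length = len(rows)
    elif axis in ("position", 1):
        # Collapse the position axis: one value per sequence (row).
        groups = list(rows)
        length = len(rows[0]) if rows else 0
    else:
        raise ValueError("axis must be 'sequence', 0, 'position', or 1")

    counts = [sum(ch in GAP_CHARS for ch in group) for group in groups]
    if not relative:
        return counts
    # A zero-length axis has no defined relative frequency; use NaN.
    return [count / length if length else float("nan") for count in counts]


rows = ["ACG", "A--", "AC.", "AG."]
per_position = gap_frequencies_sketch(rows)
per_sequence = gap_frequencies_sketch(rows, axis="position")
```

Here per_position holds one value per column and per_sequence one value per row. The real method returns a NumPy array, whereas this sketch returns plain lists.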
Notes
If there are no positions in the MSA, axis='position', and relative=True, the relative frequency of gap characters in each sequence will be np.nan.
Examples
Compute frequency of gap characters for each position in the MSA (i.e., across the sequence axis):
>>> from skbio import DNA, TabularMSA
>>> msa = TabularMSA([DNA('ACG'),
...                   DNA('A--'),
...                   DNA('AC.'),
...                   DNA('AG.')])
>>> msa.gap_frequencies()
array([0, 1, 3])
Compute relative frequencies across the same axis:
>>> msa.gap_frequencies(relative=True)
array([ 0.  ,  0.25,  0.75])
Compute frequency of gap characters for each sequence (i.e., across the position axis):
>>> msa.gap_frequencies(axis='position')
array([0, 2, 1, 1])
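The NaN case mentioned in the Notes matters when relative frequencies are used for filtering, because every comparison against NaN evaluates to False. The sketch below shows one way to make that handling explicit. The helper name keep_indices and the 0.5 threshold are hypothetical and are not part of scikit-bio. The input values are the per-sequence counts from the last example (0, 2, 1, 1) divided by the three positions in the MSA.

```python
import math


def keep_indices(relative_freqs, max_gap_fraction):
    """Return indices whose relative gap frequency is defined and within the limit."""
    kept = []
    for index, freq in enumerate(relative_freqs):
        if math.isnan(freq):
            # Undefined frequency (no positions to divide by): skip explicitly
            # instead of relying on NaN comparisons silently returning False.
            continue
        if freq <= max_gap_fraction:
            kept.append(index)
    return kept


# Per-sequence counts [0, 2, 1, 1] over 3 positions, as relative frequencies.
per_sequence_relative = [0 / 3, 2 / 3, 1 / 3, 1 / 3]
kept = keep_indices(per_sequence_relative, max_gap_fraction=0.5)

# An MSA with no positions would yield NaN for every sequence.
kept_when_undefined = keep_indices([float("nan"), float("nan")], 0.5)
```

Whether undefined frequencies should be dropped, kept, or raised as an error depends on the analysis. The sketch drops them only to show where that decision is made.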