skbio.sequence.RNA.gc_frequency

RNA.gc_frequency(relative=False)[source]

Calculate frequency of G’s and C’s in the sequence.

State: Stable as of 0.4.0.

This calculates the minimum GC frequency: only the IUPAC characters G, C, and S (which stands for G or C) are counted. Other degenerate characters that may represent G or C are not counted.

Parameters

relative (bool, optional) – If False, return the frequency of G, C, and S characters (i.e., the count). If True, return the relative frequency, i.e., the proportion of G, C, and S characters in the sequence. In this case the sequence is degapped before the calculation, so gap characters are not included in the sequence length.

Returns

Either frequency (count) or relative frequency (proportion), depending on relative.

Return type

int or float

See also

gc_content()

Examples

>>> from skbio import DNA
>>> DNA('ACGT').gc_frequency()
2
>>> DNA('ACGT').gc_frequency(relative=True)
0.5
>>> DNA('ACGT--..').gc_frequency(relative=True)
0.5
>>> DNA('--..').gc_frequency(relative=True)
0

S means G or C, so it counts:

>>> DNA('ASST').gc_frequency()
2

Other degenerates don’t count:

>>> DNA('RYKMBDHVN').gc_frequency()
0
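
The counting rule described above can be sketched in plain Python. This is an illustrative approximation only, not scikit-bio's implementation: the function name gc_frequency_sketch and the gap_chars parameter are hypothetical, and the default gap characters "-" and "." are an assumption based on the examples above.

```python
def gc_frequency_sketch(sequence, relative=False, gap_chars="-."):
    """Count G, C, and S characters, or return their proportion.

    Hypothetical helper for illustration; not part of scikit-bio.
    """
    # Only G, C, and S (the IUPAC code for "G or C") are counted.
    count = sum(1 for char in sequence if char in "GCS")
    if not relative:
        return count

    # Relative mode: gap characters are excluded from the length.
    # The gap alphabet here is an assumption, not taken from the library.
    degapped_length = sum(1 for char in sequence if char not in gap_chars)
    if degapped_length == 0:
        # A sequence made up only of gaps has no G, C, or S characters.
        return 0
    return count / degapped_length


print(gc_frequency_sketch("ACGU"))
print(gc_frequency_sketch("ACGU--..", relative=True))
```

Because gaps are removed from the denominator in relative mode, "ACGU" and "ACGU--.." yield the same proportion in this sketch.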