skbio.sequence.DNA.gc_content

DNA.gc_content()

Calculate the relative frequency of G’s and C’s in the sequence.

State: Stable as of 0.4.0.

The count includes G, C, and S characters, where S is the degenerate code for G or C. This method is equivalent to calling gc_frequency(relative=True). The sequence is degapped before the calculation, so gap characters are not counted toward the sequence length.

Returns:

float

Relative frequency of G’s and C’s in the sequence.

See also

gc_frequency

Examples

>>> from skbio import DNA
>>> DNA('ACGT').gc_content()
0.5
>>> DNA('ACGTACGT').gc_content()
0.5
>>> DNA('ACTTAGTT').gc_content()
0.25
>>> DNA('ACGT--..').gc_content()
0.5
>>> DNA('--..').gc_content()
0

S means G or C, so it counts:

>>> DNA('ASST').gc_content()
0.5

Other degenerates don’t count:

>>> DNA('RYKMBDHVN').gc_content()
0.0
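
The behaviour described above can be approximated in plain Python. This is a minimal sketch, not scikit-bio's implementation: the helper name gc_content_sketch and its gap_chars parameter are hypothetical, and it assumes an uppercase input string with "-" and "." as the only gap characters.

```python
def gc_content_sketch(seq, gap_chars="-."):
    """Illustrative helper (hypothetical; not part of scikit-bio).

    Mirrors the documented behaviour: drop gap characters, count
    G, C, and S, then divide by the degapped length.
    """
    # Degap first so gaps do not contribute to the sequence length.
    degapped = [ch for ch in seq if ch not in gap_chars]

    # An empty (or all-gap) sequence has no defined ratio; return 0,
    # matching the all-gap doctest above.
    if not degapped:
        return 0

    # S is the degenerate code for "G or C", so it counts as GC.
    gc_count = sum(ch in "GCS" for ch in degapped)
    return gc_count / len(degapped)


print(gc_content_sketch("ACGT--.."))
print(gc_content_sketch("ASST"))
print(gc_content_sketch("RYKMBDHVN"))
```

For the inputs used in the examples above, this sketch should produce the same values as DNA.gc_content(). It does not validate the alphabet, which the DNA class does on construction.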