skbio.core.sequence.NucleotideSequence

class skbio.core.sequence.NucleotideSequence(sequence, id='', description='', validate=False)

Base class for nucleotide sequences.

A NucleotideSequence is a BiologicalSequence that contains only characters from the IUPAC DNA or RNA lexicon, and that adds methods applicable only to nucleotide sequences.

See also

BiologicalSequence

Notes

All uppercase and lowercase IUPAC DNA/RNA characters are supported.
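The character check described in this note can be illustrated with a small pure-Python sketch. It does not import scikit-bio and is not the library's implementation: the helper name unsupported_chars and the constants below are hypothetical, and treating '-' and '.' as gap characters is an assumption made for the example.

```python
# Illustrative only: a stand-alone approximation of an IUPAC nucleotide check.
# None of these names come from scikit-bio.

IUPAC_STANDARD = set("ACGTU")          # non-degenerate DNA/RNA characters
IUPAC_DEGENERATE = set("RYMKWSBDHVN")  # degenerate (ambiguity) codes
GAP_CHARS = set("-.")                  # assumed gap characters


def unsupported_chars(seq):
    """Return the set of characters in seq that are not IUPAC codes or gaps.

    Upper- and lowercase characters are both accepted.
    """
    allowed = IUPAC_STANDARD | IUPAC_DEGENERATE | GAP_CHARS
    return {c for c in seq if c.upper() not in allowed}


print(unsupported_chars("ACGTNacgu-"))  # empty set: every character is allowed
print(unsupported_chars("ACGX!"))       # contains 'X' and '!'
```

Under these assumptions, a sequence would count as valid exactly when the returned set is empty.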

Attributes

description Return the description of the BiologicalSequence
id Return the id of the BiologicalSequence

Methods

__contains__(other) The in operator.
__eq__(other) The equality operator.
__getitem__(i) The indexing operator.
__hash__() The hash operator.
__iter__() The iter operator.
__len__() The len operator.
__ne__(other) The inequality operator.
__repr__() The repr method.
__reversed__() The reversed operator.
__str__() The str operator
alphabet() Return the set of characters allowed in a BiologicalSequence.
complement() Return the complement of the NucleotideSequence
complement_map() Return the mapping of characters to their complements.
count(subsequence) Returns the number of occurrences of subsequence.
degap() Returns a new BiologicalSequence with gap characters removed.
distance(other[, distance_fn]) Returns the distance to other
fraction_diff(other) Return fraction of positions that differ relative to other
fraction_same(other) Return fraction of positions that are the same relative to other
gap_alphabet() Return the set of characters defined as gaps.
gap_maps() Return tuples mapping between gapped and ungapped positions
gap_vector() Return list indicating positions containing gaps
has_unsupported_characters() Return bool indicating presence/absence of unsupported characters
index(subsequence) Return the position where subsequence first occurs
is_gap(char) Return True if char is in the gap_alphabet set
is_gapped() Return True if char(s) in gap_alphabet are present
is_reverse_complement(other) Return True if other is the reverse complement of self
is_valid() Return True if the sequence is valid
iupac_characters() Return the non-degenerate and degenerate characters.
iupac_degeneracies() Return the mapping of degenerate to non-degenerate characters.
iupac_degenerate_characters() Return the degenerate IUPAC characters.
iupac_standard_characters() Return the non-degenerate IUPAC nucleotide characters.
k_word_counts(k[, overlapping, constructor]) Get the counts of words of length k
k_word_frequencies(k[, overlapping, constructor]) Get the frequencies of words of length k
k_words(k[, overlapping, constructor]) Get the list of words of length k
lower() Convert the BiologicalSequence to lowercase
nondegenerates() Yield all nondegenerate versions of the sequence.
rc() Return the reverse complement of the NucleotideSequence
reverse_complement() Return the reverse complement of the NucleotideSequence
to_fasta([field_delimiter, terminal_character]) Return the sequence as a fasta-formatted string
unsupported_characters() Return the set of unsupported characters in the BiologicalSequence
upper() Convert the BiologicalSequence to uppercase
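
The behaviour summarized for complement(), reverse_complement(), is_reverse_complement() and k_words() can be sketched on plain strings. This is a minimal illustration under stated assumptions, not scikit-bio's code: the functions below are hypothetical stand-ins that operate on str rather than on NucleotideSequence objects, the complement table uses DNA pairing (A with T), and the non-overlapping k_words variant is assumed to drop a trailing partial word.

```python
# Illustrative only: string-based stand-ins for a few of the methods listed above.

# Complement pairs for uppercase IUPAC codes (DNA pairing), plus gap characters.
_PAIRS = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
    "W": "W", "S": "S", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N", "-": "-", ".": ".",
}
# Extend the table so lowercase input maps to lowercase output.
COMPLEMENT = dict(_PAIRS)
COMPLEMENT.update({k.lower(): v.lower() for k, v in _PAIRS.items()})


def complement(seq):
    """Complement each character, keeping the original order."""
    return "".join(COMPLEMENT[c] for c in seq)


def reverse_complement(seq):
    """Complement each character, then reverse the result."""
    return complement(seq)[::-1]


def is_reverse_complement(a, b):
    """Return True if b is the reverse complement of a."""
    return a == reverse_complement(b)


def k_words(seq, k, overlapping=True):
    """List the words of length k; step by 1 if overlapping, else by k."""
    step = 1 if overlapping else k
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, step)]


print(complement("GATTACA"))                        # CTAATGT
print(reverse_complement("GATTACA"))                # TGTAATC
print(is_reverse_complement("GATTACA", "TGTAATC"))  # True
print(k_words("GATTACA", 3))                        # ['GAT', 'ATT', 'TTA', 'TAC', 'ACA']
print(k_words("GATTACA", 3, overlapping=False))     # ['GAT', 'TAC']
```

In this sketch rc() would simply be another name for reverse_complement(), matching the identical descriptions given for the two methods above.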