Class for storing collections of biological sequences.
Parameters
seqs : list of skbio.sequence.BiologicalSequence objects
validate : bool, optional
Raises
skbio.alignment.SequenceCollectionError
See also
skbio.sequence.BiologicalSequence, skbio.sequence.NucleotideSequence, skbio.sequence.DNASequence, skbio.sequence.RNASequence, Alignment, skbio.parse.sequences, skbio.parse.sequences.parse_fasta
Examples
>>> from skbio.alignment import SequenceCollection
>>> from skbio.sequence import DNA
>>> sequences = [DNA('ACCGT', id="seq1"),
...              DNA('AACCGGT', id="seq2")]
>>> s1 = SequenceCollection(sequences)
>>> s1
<SequenceCollection: n=2; mean +/- std length=6.00 +/- 1.00>
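The summary in the repr above can be reproduced by hand. The following is a minimal sketch in plain Python that does not call scikit-bio: plain strings stand in for the DNA objects, and the use of the population standard deviation is an assumption inferred from the printed value (the sample standard deviation of lengths 5 and 7 would be about 1.41, not 1.00).

```python
from statistics import mean, pstdev

# Plain strings stand in for the DNA objects from the example above.
sequences = {"seq1": "ACCGT", "seq2": "AACCGGT"}

# Sequence lengths: 5 and 7.
lengths = [len(seq) for seq in sequences.values()]

n = len(lengths)
center = mean(lengths)    # arithmetic mean of the lengths
spread = pstdev(lengths)  # population standard deviation (divides by n); an assumption

# Format the three values the same way the repr displays them.
summary = "n=%d; mean +/- std length=%.2f +/- %.2f" % (n, center, spread)
print(summary)
```

The count, center, and spread here correspond to the three quantities that the distribution_stats entry in the Methods table describes.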
Methods
__contains__(id) | The in operator. |
__eq__(other) | The equality operator. |
__getitem__(index) | The indexing operator. |
__iter__() | The iter operator. |
__len__() | The len operator. |
__ne__(other) | The inequality operator. |
__repr__() | The repr method. |
__reversed__() | The reversed method. |
__str__() | The str method. |
degap() | Return a new SequenceCollection with all gap characters removed. |
distances(distance_fn) | Compute distances between all pairs of sequences |
distribution_stats([center_f, spread_f]) | Return the sequence count, plus the center and spread of the sequence lengths |
from_fasta_records(fasta_records, ...[, ...]) | Initialize a SequenceCollection object |
get_seq(id) | Return a sequence from the SequenceCollection by its id. |
ids() | Returns the BiologicalSequence ids |
int_map([prefix]) | Create an integer-based mapping of sequence ids |
is_empty() | Return True if the SequenceCollection is empty |
is_valid() | Return True if the SequenceCollection is valid |
iteritems() | Generator of id, sequence tuples |
k_word_frequencies(k[, overlapping]) | Return k-word frequencies for sequences in SequenceCollection. |
lower() | Converts all sequences to lowercase |
read(fp[, format]) | Create a new SequenceCollection instance from a file. |
sequence_count() | Return the count of sequences in the SequenceCollection |
sequence_lengths() | Return lengths of the sequences in the SequenceCollection |
toFasta() | Return fasta-formatted string representing the SequenceCollection |
to_fasta() | Return fasta-formatted string representing the SequenceCollection |
update_ids([ids, fn, prefix]) | Update sequence IDs on the sequence collection. |
upper() | Converts all sequences to uppercase |
write(fp[, format]) | Write an instance of SequenceCollection to a file. |
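To illustrate what the k_word_frequencies entry means by k-words and by its overlapping option, here is a minimal sketch in plain Python. The helper k_word_frequencies_sketch is hypothetical and is not part of scikit-bio; it works on a single string, and the library method's actual return type and edge-case handling may differ.

```python
from collections import Counter


def k_word_frequencies_sketch(seq, k, overlapping=True):
    """Hypothetical helper: relative frequency of each k-word in a string.

    With overlapping=True the window advances one position at a time;
    otherwise it advances k positions, so the windows do not share characters.
    """
    step = 1 if overlapping else k
    # Collect every full-length window of k characters.
    words = [seq[i:i + k] for i in range(0, len(seq) - k + 1, step)]
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    return {word: count / total for word, count in counts.items()}


# 'ACCGT' has four overlapping 2-words (AC, CC, CG, GT) ...
overlapping_freqs = k_word_frequencies_sketch("ACCGT", 2)
# ... and two non-overlapping ones (AC, CG); the trailing 'T' is dropped.
non_overlapping_freqs = k_word_frequencies_sketch("ACCGT", 2, overlapping=False)
print(overlapping_freqs)
print(non_overlapping_freqs)
```

Dropping the incomplete trailing window in the non-overlapping case is a choice made for this sketch, not a documented behavior of the library.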