skbio.parse.sequences.parse_fasta

skbio.parse.sequences.parse_fasta(infile, strict=True, label_to_name=<type 'str'>, finder=<function parser at 0x3d67cf8>, is_label=None, label_characters='>')

Yield the label and sequence of each record in a FASTA file.

Parameters:

infile : open file object or str

An open FASTA file, or the path to one.

strict : bool

If strict is True, a RecordError is raised when no header line is found.

Returns:

label, sequence : string

Yields the label and sequence for each entry.

Examples

Assume we have a FASTA-formatted file with the following contents:

>seq1
CGATGTCGATCGATCGATCGATCAG
>seq2
CATCGATCGATCGATGCATGCATGCATG

>>> from StringIO import StringIO
>>> from skbio.parse.sequences import parse_fasta
>>> fasta_f = StringIO('>seq1\n'
...                    'CGATGTCGATCGATCGATCGATCAG\n'
...                    '>seq2\n'
...                    'CATCGATCGATCGATGCATGCATGCATG\n')
>>> for label, seq in parse_fasta(fasta_f):
...     print label
...     print seq
seq1
CGATGTCGATCGATCGATCGATCAG
seq2
CATCGATCGATCGATGCATGCATGCATG
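
To illustrate the behaviour described above (a generator of label and sequence pairs, with an error in strict mode when sequence data has no header), here is a minimal pure-Python sketch. It is not scikit-bio's implementation: `simple_parse_fasta` and `FastaFormatError` are hypothetical names introduced for this example, the latter standing in for `RecordError`, and the handling of headerless data when `strict` is False (silently skipping it) is an assumption.

```python
from io import StringIO


class FastaFormatError(ValueError):
    """Hypothetical stand-in for skbio's RecordError."""


def simple_parse_fasta(lines, strict=True):
    """Yield (label, sequence) pairs from an iterable of FASTA lines.

    Illustrative sketch only; not the skbio implementation.
    """
    label = None   # label of the record currently being read
    chunks = []    # sequence lines collected for that record
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('>'):
            # A new header closes the previous record, if any.
            if label is not None:
                yield label, ''.join(chunks)
            label = line[1:].strip()
            chunks = []
        elif label is None:
            # Sequence data appeared before any header line.
            if strict:
                raise FastaFormatError('sequence data found before a header line')
            # Assumption: in non-strict mode the orphan line is skipped.
        else:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if label is not None:
        yield label, ''.join(chunks)


fasta_f = StringIO('>seq1\n'
                   'CGATGTCGATCGATCGATCGATCAG\n'
                   '>seq2\n'
                   'CATCGATCGATC\n'
                   'GATGCATGCATGCATG\n')
records = list(simple_parse_fasta(fasta_f))
for label, seq in records:
    print(label)
    print(seq)
```

Because the function is a generator, records are produced one at a time as the input is read, so a large file does not need to be held in memory. Sequences wrapped across several lines, as with seq2 here, are joined into a single string.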