skbio.alignment.TabularMSA.iter_positions

TabularMSA.iter_positions(reverse=False, ignore_metadata=False)

Iterate over positions (columns) in the MSA.

State: Experimental as of 0.4.1.

Parameters
  • reverse (bool, optional) – If True, iterate over positions in reverse order.

  • ignore_metadata (bool, optional) – If True, Sequence.metadata and Sequence.positional_metadata will not be included. This can significantly improve performance if metadata is not needed.

Yields

Sequence – Each position, in the order the positions are stored in the MSA (or in reverse order if reverse is True).

Notes

Each position is yielded as a plain Sequence object, regardless of this MSA’s dtype. Sequence is used because a position is an artifact of multiple sequence alignment, not a real biological sequence.

Each Sequence object will have its corresponding MSA positional metadata stored as metadata unless ignore_metadata is set to True.
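
The relationship between MSA positional metadata and per-position metadata can be sketched in plain Python. This is only a conceptual model, not scikit-bio's implementation: iter_columns is a hypothetical helper, aligned sequences are stood in for by equal-length strings, and each yielded position is a (column, metadata) tuple rather than a Sequence object.

```python
# Conceptual sketch only: plain strings and dicts stand in for
# TabularMSA and Sequence. iter_columns is a hypothetical helper and is
# not part of scikit-bio.
def iter_columns(rows, positional_metadata=None, reverse=False,
                 ignore_metadata=False):
    """Yield (column, metadata) pairs for equal-length aligned strings."""
    # Transpose the rows so that each item is one alignment column.
    columns = [''.join(chars) for chars in zip(*rows)]
    indices = range(len(columns))
    if reverse:
        indices = reversed(indices)
    for i in indices:
        metadata = {}
        if positional_metadata is not None and not ignore_metadata:
            # The MSA-level value at index i becomes the column's metadata.
            metadata = {name: values[i]
                        for name, values in positional_metadata.items()}
        yield columns[i], metadata


rows = ['ACG', 'A-T']
prob = {'prob': [3, 1, 2]}

forward = list(iter_columns(rows, prob))
backward = list(iter_columns(rows, prob, reverse=True))
bare = list(iter_columns(rows, prob, ignore_metadata=True))
```

In this sketch, forward pairs the columns 'AA', 'C-' and 'GT' with the 'prob' values 3, 1 and 2, backward holds the same pairs in the opposite order, and bare carries an empty dict for every column.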

The positional metadata of the MSA’s sequences is concatenated using an outer join to form the positional metadata of each yielded Sequence, unless ignore_metadata is set to True. See Sequence.concat(how='outer') for details.
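
A rough illustration of what an outer join means for a single position, again in plain Python rather than with scikit-bio objects. The helper column_positional_metadata and the column names 'quality' and 'masked' are hypothetical. Filling missing entries with NaN is an assumption that mirrors pandas-style outer joins; consult Sequence.concat for the library's actual behavior.

```python
import math


# Hypothetical helper, not part of scikit-bio: outer-join the positional
# metadata of every sequence at one MSA position.
def column_positional_metadata(per_sequence_metadata, index):
    """Return {column name: one value per sequence} for a position."""
    # Outer join: take the union of column names, in first-seen order.
    names = []
    for pm in per_sequence_metadata:
        for name in pm:
            if name not in names:
                names.append(name)
    # Sequences lacking a column contribute NaN (assumed fill value).
    return {
        name: [pm[name][index] if name in pm else math.nan
               for pm in per_sequence_metadata]
        for name in names
    }


seq_metadata = [
    {'quality': [30, 31, 32]},                      # first sequence
    {'quality': [20, 21, 22],
     'masked': [False, True, False]},               # second sequence
]
joined = column_positional_metadata(seq_metadata, 1)
```

Here joined contains both 'quality' and 'masked'. The first sequence has no 'masked' column, so its entry in that column is NaN, while an inner join would have dropped 'masked' entirely.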

Examples

Create an MSA with positional metadata:

>>> from skbio import DNA, TabularMSA
>>> sequences = [DNA('ACG'),
...              DNA('A-T')]
>>> msa = TabularMSA(sequences,
...                  positional_metadata={'prob': [3, 1, 2]})

Iterate over positions:

>>> for position in msa.iter_positions():
...     position
...     print()
Sequence
-------------
Metadata:
    'prob': 3
Stats:
    length: 2
-------------
0 AA

Sequence
-------------
Metadata:
    'prob': 1
Stats:
    length: 2
-------------
0 C-

Sequence
-------------
Metadata:
    'prob': 2
Stats:
    length: 2
-------------
0 GT

Note that MSA positional metadata is stored as metadata on each Sequence object.

Iterate over positions in reverse order:

>>> for position in msa.iter_positions(reverse=True):
...     position
...     print()
Sequence
-------------
Metadata:
    'prob': 2
Stats:
    length: 2
-------------
0 GT

Sequence
-------------
Metadata:
    'prob': 1
Stats:
    length: 2
-------------
0 C-

Sequence
-------------
Metadata:
    'prob': 3
Stats:
    length: 2
-------------
0 AA