TabularMSA.iloc
Slice the MSA on either axis by index position.
State: Experimental as of 0.4.1.
This will return an object with the following interface:
msa.iloc[seq_idx]
msa.iloc[seq_idx, pos_idx]
msa.iloc(axis='sequence')[seq_idx]
msa.iloc(axis='position')[pos_idx]
Parameters:
    seq_idx : int, slice, iterable (int and slice), 1D array_like (bool)
    pos_idx : (same as seq_idx), optional
    axis : {'sequence', 'position', 0, 1, None}, optional
Returns:
    TabularMSA, GrammaredSequence, Sequence
See also
Notes
If the slice operation results in a TabularMSA without any sequences, the MSA’s positional_metadata will be unset.
Examples
First we need to set up an MSA to slice:
>>> from skbio import TabularMSA, DNA
>>> msa = TabularMSA([DNA("ACGT"), DNA("A-GT"), DNA("AC-T"),
... DNA("ACGA")])
>>> msa
TabularMSA[DNA]
---------------------
Stats:
sequence count: 4
position count: 4
---------------------
ACGT
A-GT
AC-T
ACGA
When we slice by a scalar we get the original sequence back out of the MSA:
>>> msa.iloc[1]
DNA
--------------------------
Stats:
length: 4
has gaps: True
has degenerates: False
has definites: True
GC-content: 33.33%
--------------------------
0 A-GT
Similarly when we slice the second axis by a scalar we get a column of the MSA:
>>> msa.iloc[..., 1]
Sequence
-------------
Stats:
length: 4
-------------
0 C-CC
Note: we return an skbio.Sequence object because the column of an alignment has no biological meaning and many operations defined for the MSA’s sequence dtype would be meaningless.
When we slice both axes by a scalar, operations are applied left to right:
>>> msa.iloc[0, 0]
DNA
--------------------------
Stats:
length: 1
has gaps: False
has degenerates: False
has definites: True
GC-content: 0.00%
--------------------------
0 A
In other words, it exactly matches slicing the resulting sequence object directly:
>>> msa.iloc[0][0]
DNA
--------------------------
Stats:
length: 1
has gaps: False
has degenerates: False
has definites: True
GC-content: 0.00%
--------------------------
0 A
When our slice is non-scalar we get back an MSA of the same dtype:
>>> msa.iloc[[0, 2]]
TabularMSA[DNA]
---------------------
Stats:
sequence count: 2
position count: 4
---------------------
ACGT
AC-T
We can similarly slice out a column of that:
>>> msa.iloc[[0, 2], 2]
Sequence
-------------
Stats:
length: 2
-------------
0 G-
Slice syntax works as well:
>>> msa.iloc[:3]
TabularMSA[DNA]
---------------------
Stats:
sequence count: 3
position count: 4
---------------------
ACGT
A-GT
AC-T
We can also use boolean vectors:
>>> msa.iloc[[True, False, False, True], 2:3]
TabularMSA[DNA]
---------------------
Stats:
sequence count: 2
position count: 1
---------------------
G
G
Here we sliced the first axis by a boolean vector and restricted the second axis to a single column. Because the second axis was given a non-scalar index (the slice 2:3), we still receive an MSA even though only one column is present.