skbio.alignment.TabularMSA(sequences, metadata=None, positional_metadata=None, minter=None, index=None)
Store a multiple sequence alignment in tabular (row/column) form.
Parameters:
    sequences : iterable of GrammaredSequence, TabularMSA
    metadata : dict, optional
    positional_metadata : pd.DataFrame consumable, optional
    minter : callable or metadata key, optional
    index : pd.Index consumable, optional
Raises:
    ValueError
    ValueError
    TypeError
    TypeError
    ValueError
See also
skbio.sequence.DNA, skbio.sequence.RNA, skbio.sequence.Protein, pandas.DataFrame, pandas.Index, reassign_index
Notes
If neither minter nor index is provided, default index labels will be used: pd.RangeIndex(start=0, stop=len(sequences), step=1).
Examples
Create a TabularMSA object with three DNA sequences and four positions:
>>> from skbio import DNA, TabularMSA
>>> seqs = [
...     DNA('ACGT'),
...     DNA('AG-T'),
...     DNA('-C-T')
... ]
>>> msa = TabularMSA(seqs)
>>> msa
TabularMSA[DNA]
---------------------
Stats:
    sequence count: 3
    position count: 4
---------------------
ACGT
AG-T
-C-T
Since neither minter nor index was provided, the MSA has default index labels:
>>> msa.index
RangeIndex(start=0, stop=3, step=1)
Create an MSA with metadata, positional metadata, and non-default index labels:
>>> msa = TabularMSA(seqs, index=['seq1', 'seq2', 'seq3'],
...                  metadata={'id': 'msa-id'},
...                  positional_metadata={'prob': [3, 4, 2, 2]})
>>> msa
TabularMSA[DNA]
--------------------------
Metadata:
    'id': 'msa-id'
Positional metadata:
    'prob': <dtype: int64>
Stats:
    sequence count: 3
    position count: 4
--------------------------
ACGT
AG-T
-C-T
>>> msa.index
Index(['seq1', 'seq2', 'seq3'], dtype='object')
Attributes
dtype | Data type of the stored sequences.
iloc | Slice the MSA on either axis by index position.
index | Index containing labels along the sequence axis.
loc | Slice the MSA on first axis by index label, second axis by position.
metadata | dict containing metadata which applies to the entire object.
positional_metadata | pd.DataFrame containing metadata along an axis.
shape | Number of sequences (rows) and positions (columns).
Methods
bool(msa) | Boolean indicating whether the MSA is empty or not.
x in msa | Determine if an index label is in this MSA.
copy.copy(msa) | Return a shallow copy of this MSA.
copy.deepcopy(msa) | Return a deep copy of this MSA.
msa1 == msa2 | Determine if this MSA is equal to another.
msa[x] | Slice the MSA on either axis.
iter(msa) | Iterate over sequences in the MSA.
len(msa) | Number of sequences in the MSA.
msa1 != msa2 | Determine if this MSA is not equal to another.
reversed(msa) | Iterate in reverse order over sequences in the MSA.
str(msa) | String summary of this MSA.
append(sequence[, minter, index, reset_index]) | Append a sequence to the MSA without recomputing alignment.
consensus() | Compute the majority consensus sequence for this MSA.
conservation([metric, degenerate_mode, gap_mode]) | Apply metric to compute conservation for all alignment positions.
extend(sequences[, minter, index, reset_index]) | Extend this MSA with sequences without recomputing alignment.
from_dict(dictionary) | Create a TabularMSA from a dict.
gap_frequencies([axis, relative]) | Compute frequency of gap characters across an axis.
has_metadata() | Determine if the object has metadata.
has_positional_metadata() | Determine if the object has positional metadata.
iter_positions([reverse, ignore_metadata]) | Iterate over positions (columns) in the MSA.
join(other[, how]) | Join this MSA with another by sequence (horizontally).
read(file[, format]) | Create a new TabularMSA instance from a file.
reassign_index([mapping, minter]) | Reassign index labels to sequences in this MSA.
sort([level, ascending]) | Sort sequences by index label in-place.
to_dict() | Create a dict from this TabularMSA.
write(file[, format]) | Write an instance of TabularMSA to a file.