skbio.sequence.DNA.concat

DNA.concat(sequences, how='strict')

Concatenate an iterable of Sequence objects.

State: Experimental as of 0.4.1.

Parameters:

sequences : iterable (Sequence)

An iterable of Sequence objects or appropriate subclasses.

how : {‘strict’, ‘inner’, ‘outer’}, optional

How to combine the positional_metadata columns of the sequences. If ‘strict’, every sequence's positional_metadata must have exactly the same columns. If ‘inner’, the columns are inner-joined (only the columns shared by all sequences are kept). If ‘outer’, the columns are outer-joined (all columns are kept, and missing values are padded with NaN).

Returns:

Sequence

The returned sequence will be an instance of the class on which this class method was called.

Raises:

ValueError

If how is not one of: ‘strict’, ‘inner’, or ‘outer’.

ValueError

If how is ‘strict’ and the sequences do not all have the same positional_metadata columns.

TypeError

If the sequences cannot be cast as the calling class.

Notes

The sequence-wide metadata (Sequence.metadata) is not retained during concatenation.

Sequence objects can be cast to a different type only when the new type is an ancestor or child of the original type. Casting between sibling types is not allowed: DNA -> RNA is rejected, but DNA -> Sequence and Sequence -> DNA are permitted.
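
The casting rule can be pictured as a subclass check in either direction. The following is a minimal sketch, not scikit-bio's implementation: BaseSeq, DnaSeq, RnaSeq, and can_cast are hypothetical stand-ins introduced only for illustration, and the check treats any descendant (not just a direct child) as castable, which is an assumption.

```python
# Hypothetical stand-in classes; these are NOT the scikit-bio classes.
class BaseSeq:
    pass


class DnaSeq(BaseSeq):
    pass


class RnaSeq(BaseSeq):
    pass


def can_cast(source_type, target_type):
    """Return True if one type is an ancestor or a descendant of the other."""
    return (issubclass(source_type, target_type)
            or issubclass(target_type, source_type))


print(can_cast(DnaSeq, BaseSeq))  # True: BaseSeq is an ancestor of DnaSeq
print(can_cast(BaseSeq, DnaSeq))  # True: DnaSeq is a descendant of BaseSeq
print(can_cast(DnaSeq, RnaSeq))   # False: DnaSeq and RnaSeq are siblings
```

Under this model, a sibling cast is the case that would correspond to the TypeError listed under Raises.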

Examples

Concatenate two DNA sequences into a new DNA object:

>>> from skbio import DNA, Sequence
>>> s1 = DNA("ACGT")
>>> s2 = DNA("GGAA")
>>> DNA.concat([s1, s2])
DNA
-----------------------------
Stats:
    length: 8
    has gaps: False
    has degenerates: False
    has non-degenerates: True
    GC-content: 50.00%
-----------------------------
0 ACGTGGAA

Concatenate DNA sequences into a Sequence object (type coercion):

>>> Sequence.concat([s1, s2])
Sequence
-------------
Stats:
    length: 8
-------------
0 ACGTGGAA

Positional metadata is conserved:

>>> s1 = DNA('AcgT', lowercase='one')
>>> s2 = DNA('GGaA', lowercase='one',
...          positional_metadata={'two': [1, 2, 3, 4]})
>>> result = DNA.concat([s1, s2], how='outer')
>>> result
DNA
-----------------------------
Positional metadata:
    'one': <dtype: bool>
    'two': <dtype: float64>
Stats:
    length: 8
    has gaps: False
    has degenerates: False
    has non-degenerates: True
    GC-content: 50.00%
-----------------------------
0 ACGTGGAA
>>> result.positional_metadata
     one  two
0  False  NaN
1   True  NaN
2   True  NaN
3  False  NaN
4  False    1
5  False    2
6   True    3
7  False    4
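
The three how modes can also be modeled without scikit-bio or pandas. The sketch below is a plain-Python approximation under stated assumptions, not the library's code: concat_columns and the (length, columns) table layout are hypothetical, and unlike the pandas-backed result above, the padded column is not converted to float64.

```python
import math


def concat_columns(tables, how="strict"):
    """Concatenate per-position tables under a 'strict', 'inner', or 'outer' rule.

    Each table is a (length, columns) pair, where columns maps a column
    name to a list holding one value per position.
    """
    if how not in ("strict", "inner", "outer"):
        raise ValueError("how must be 'strict', 'inner', or 'outer'")
    if not tables:
        return {}

    column_sets = [set(columns) for _, columns in tables]
    if how == "strict":
        # Every table must expose exactly the same column names.
        if any(names != column_sets[0] for names in column_sets[1:]):
            raise ValueError("all tables must have the same columns")
        kept = column_sets[0]
    elif how == "inner":
        kept = set.intersection(*column_sets)
    else:
        kept = set.union(*column_sets)

    result = {name: [] for name in sorted(kept)}
    for length, columns in tables:
        for name in result:
            # Pad with NaN where a table lacks the column (only possible for 'outer').
            result[name].extend(columns.get(name, [math.nan] * length))
    return result


# Same column layout as the s1 and s2 example above.
s1 = (4, {"one": [False, True, True, False]})
s2 = (4, {"one": [False, False, True, False], "two": [1, 2, 3, 4]})

inner = concat_columns([s1, s2], how="inner")
outer = concat_columns([s1, s2], how="outer")
print(list(inner))   # ['one']
print(list(outer))   # ['one', 'two']
print(outer["two"])  # [nan, nan, nan, nan, 1, 2, 3, 4]
```

With how='strict', the same two tables raise ValueError because their column sets differ.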