skbio.metadata.Interval

class skbio.metadata.Interval(interval_metadata, bounds, fuzzy=None, metadata=None)

Stores the bounds and metadata of an interval feature.

This class stores an interval feature. An interval feature is defined as a sub-region of a biological sequence or sequence alignment that is a functional entity, such as a gene, a riboswitch, or an exon. It can span a single contiguous region or multiple non-contiguous regions (e.g., multiple exons in a transcript, or multiple genes in an operon).

Parameters:

interval_metadata : object

A reference to the IntervalMetadata object that this Interval object is associated with.

bounds : iterable of tuple of int

Tuples representing start and end coordinates. Coordinates are zero-based, and each tuple is always inclusive of the start bound and exclusive of the end bound.

fuzzy : iterable of tuple of bool, optional

Tuples representing the fuzziness of each bound coordinate. If this isn’t specified, the fuzziness of all bound coordinates is False. If the fuzziness of a coordinate is True, the exact boundary point of the interval feature is unknown: the feature may begin or end at some point outside the specified coordinates. This accommodates the location format [R191] of INSDC.

metadata : dict, optional

Dictionary of attributes storing information about the feature, such as “strand”, “gene_name”, or “product”.

Notes

While constructing an Interval object automatically adds it to its associated IntervalMetadata object, IntervalMetadata.add is the typical and easier way to create an Interval and add it to an IntervalMetadata object.

References

[R191] ftp://ftp.ebi.ac.uk/pub/databases/embl/doc/FT_current.html#3.4.3

Examples

Hypothetically, let’s say we have a gene called “genA” with 10 nt as shown in the following diagram. The second row represents the two exons (indicated by “=”) on this gene:

TGGATTCTGC
-====--==-
0123456789

We can create an Interval object to represent the exons of the gene:

>>> from skbio.metadata import Interval, IntervalMetadata
>>> interval_metadata = IntervalMetadata(10)

Remember that the coordinates are inclusive of the lower bound and exclusive of the upper bound:

>>> gene = Interval(interval_metadata,
...                 bounds=[(1, 5), (7, 9)],
...                 metadata={'name': 'genA'})
>>> gene
Interval(interval_metadata=..., bounds=[(1, 5), (7, 9)], fuzzy=[(False, False), (False, False)], metadata={'name': 'genA'})
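Because the bounds are zero-based and half-open, each tuple lines up with a Python slice. The following sketch uses only plain Python strings (no scikit-bio calls) to show which bases the bounds above select from the diagram; the variable names are illustrative.

```python
# Sequence and exon bounds taken from the diagram above.
sequence = "TGGATTCTGC"
bounds = [(1, 5), (7, 9)]

# Each (start, end) pair is zero-based, inclusive of start and exclusive
# of end, which is exactly how Python slicing behaves.
exons = [sequence[start:end] for start, end in bounds]
print(exons)  # ['GGAT', 'TG']

# The number of covered positions is the sum of end - start.
covered = sum(end - start for start, end in bounds)
print(covered)  # 6
```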

Attributes

bounds The coordinates of the interval feature.
dropped Boolean value indicating if the Interval object is dropped.
fuzzy The openness of each coordinate.
metadata The metadata of the interval feature.

Methods

interval1 == interval2 Test if this Interval object is equal to another.
interval1 != interval2 Test if this Interval object is not equal to another.
drop() Drop this Interval object from the interval metadata it links to.