scikit-bio is an open-source, BSD-licensed Python package providing data structures, algorithms, and educational resources for bioinformatics.

scikit-bio is currently in alpha. We are very actively developing it, and backwards-incompatible interface changes can and will arise. Once the API has started to solidify, we will strive to maintain backwards compatibility. We will provide deprecation warnings wherever possible in the scikit-bio code, documentation, and

Note: Deprecation warnings will be issued using Python's DeprecationWarning class. Since Python 2.7, these types of warnings are silenced by default. When developing a tool that uses scikit-bio, we recommend enabling the display of deprecation warnings to be informed of upcoming API changes. For details on how to display deprecation warnings, see Python's deprecation warning docs.
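As a minimal sketch of the recommendation above, the standard library's warnings module can re-enable the display of DeprecationWarning at the top of your tool or test script. This uses only Python's own API and nothing specific to scikit-bio. The same effect is available without code changes by running the interpreter with -W default.

```python
import warnings

# Show each DeprecationWarning once per location where it is raised,
# instead of silencing it. Call this before importing or using the
# library whose deprecations you want to see.
warnings.simplefilter("default", category=DeprecationWarning)
```

The "default" action is used here rather than "always" to keep the output short; use "error" instead if you would rather have your test suite fail on deprecated calls.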

Installation of release version (recommended for most users)

To install the latest release version of scikit-bio you should run:

pip install numpy
pip install scikit-bio

Alternatively, you can use the conda package manager, available in Anaconda or Miniconda, to install scikit-bio and all of its dependencies without having to compile them:

conda install scikit-bio

Finally, most of scikit-bio's dependencies (in particular, the ones that are trickier to build) are also available in Canopy Express, albeit only for Python 2.

You can verify your installation by running the scikit-bio unit tests as follows:

nosetests --with-doctest skbio
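Before running the full test suite, a lighter check is to ask Python where it would import skbio from. The sketch below is not part of scikit-bio; describe_install is a hypothetical helper name, and it relies only on the standard library's importlib, so it runs whether or not scikit-bio is installed.

```python
import importlib.util


def describe_install(package="skbio"):
    """Return the file the package would be imported from, or None."""
    # find_spec locates a top-level package without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None:
        return None
    return spec.origin


location = describe_install()
if location is None:
    print("skbio is not importable from this environment")
else:
    print("skbio will be imported from:", location)
```

The reported path also shows which copy of scikit-bio is in use, which helps when both a release install and a source checkout are present.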

Installation of development version

The latest development version of scikit-bio is recommended for developers only, as the development code can be unstable and less documented than the release code. If you're interested in working with it, make sure you have git installed, then clone the repository and install as follows.

git clone
cd scikit-bio
pip install .

After this completes, you can run the scikit-bio unit tests as follows. You must first cd out of the scikit-bio directory for the tests to pass (here we cd to the home directory).

cd
nosetests --with-doctest skbio

For developers of scikit-bio, if you don't want to be forced to re-install after every change, you can modify the above pip install command to:

pip install -e .

This will build scikit-bio's Cython extensions, and will create a link in the site-packages directory to the scikit-bio source directory. When you then make changes to code in the source directory, those will be used (e.g., by the unit tests) without re-installing.

Finally, if you don't want to use pip to install scikit-bio and prefer to just put scikit-bio in your $PYTHONPATH, at a minimum you should run:

python setup.py build_ext --inplace

This will build scikit-bio's Cython extensions, but will not create a link to the scikit-bio source directory in site-packages. If the extensions are not built, certain components of scikit-bio will be inefficient and will produce an EfficiencyWarning.
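To find out whether a particular call runs without the compiled extensions, you can record the warnings it emits and look for an EfficiencyWarning. The following is a rough sketch, not scikit-bio's own tooling. It assumes the warning class can be imported from skbio.util and falls back to a stand-in class so the example still runs without scikit-bio; emits_efficiency_warning is a hypothetical helper name.

```python
import warnings

try:
    # Assumption: scikit-bio exposes the warning class in skbio.util.
    from skbio.util import EfficiencyWarning
except ImportError:
    # Stand-in so this sketch runs without scikit-bio installed.
    class EfficiencyWarning(Warning):
        pass


def emits_efficiency_warning(func, *args, **kwargs):
    """Return True if calling func emits an EfficiencyWarning."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown elsewhere.
        warnings.simplefilter("always")
        func(*args, **kwargs)
    return any(issubclass(w.category, EfficiencyWarning) for w in caught)
```

Pass any callable from your own code that uses scikit-bio. A True result suggests that the extensions should be built as described above.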

Getting help

To get help with scikit-bio, you should use the skbio tag on StackOverflow (SO). Before posting a question, check out SO's guide on how to ask a question. The scikit-bio developers regularly monitor the skbio SO tag.


scikit-bio is available under the new BSD license. See COPYING.txt for scikit-bio's license, and the licenses directory for the licenses of third-party software that is (either partially or entirely) distributed with scikit-bio.

Projects using scikit-bio

Some of the projects that we know of that are using scikit-bio are:

If you're using scikit-bio in your own projects, you can issue a pull request to add them to this list.

scikit-bio development

If you're interested in getting involved in or learning about scikit-bio development, see

See the list of all of scikit-bio's contributors.

Summaries of our weekly developer meetings are posted on HackPad. Click here to view the meeting notes for 2014.

The pre-history of scikit-bio

scikit-bio began from code derived from PyCogent and QIIME, and the contributors and/or copyright holders have agreed to make the code they wrote for PyCogent and/or QIIME available under the BSD license. The contributors to PyCogent and/or QIIME modules that have been ported to scikit-bio are: Rob Knight (@rob-knight), Gavin Huttley (@gavin-huttley), Daniel McDonald (@wasade), Micah Hamady, Antonio Gonzalez (@antgonza), Sandra Smit, Greg Caporaso (@gregcaporaso), Jai Ram Rideout (@ElBrogrammer), Cathy Lozupone (@clozupone), Mike Robeson (@mikerobeson), Marcin Cieslik, Peter Maxwell, Jeremy Widmann, Zongzhi Liu, Michael Dwan, Logan Knecht (@loganknecht), Andrew Cochran, Jose Carlos Clemente (@cleme), Damien Coy, Levi McCracken, Andrew Butterfield, Will Van Treuren (@wdwvt1), Justin Kuczynski (@justin212k), Jose Antonio Navas Molina (@josenavas), Matthew Wakefield (@genomematt) and Jens Reeder (@jensreeder).


scikit-bio's logo was created by Alina Prassas. scikit-bio's ASCII art tree was created by @gregcaporaso. Our text logo was created at