skbio.util.safe_md5

skbio.util.safe_md5(open_file, block_size=1048576)

Computes an MD5 checksum without loading the entire file into memory

Parameters:

open_file : file object

open file handle for the file whose checksum is to be computed. It must be opened in binary mode

block_size : int, optional

number of bytes read from the file per iteration

Returns:

md5 : md5 object from the hashlib module

md5 object updated with the full contents of the file; call hexdigest() on it to obtain the checksum as a hexadecimal string

Notes

This method is based on the answer given at: http://stackoverflow.com/a/1131255/379593
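
The block-wise approach can be sketched roughly as follows. This is an illustration written against the standard hashlib module only, not the scikit-bio source: the helper name chunked_md5 is hypothetical, and the use of iter() with a sentinel is an assumption about one reasonable way to write the read loop.

```python
import hashlib
from io import BytesIO


def chunked_md5(open_file, block_size=1048576):
    """Hypothetical helper: hash a binary file object one block at a time."""
    md5 = hashlib.md5()
    # iter() with a sentinel calls read() repeatedly until it returns b"",
    # which a binary file object does at end of file.
    for block in iter(lambda: open_file.read(block_size), b""):
        md5.update(block)
    return md5


# A tiny block size forces several iterations over the in-memory "file".
fd = BytesIO(b"foo bar baz")
digest_small_blocks = chunked_md5(fd, block_size=4).hexdigest()
fd.close()

# Hashing the same bytes in a single call gives the same digest.
whole = hashlib.md5(b"foo bar baz").hexdigest()
```

Because update() can be called repeatedly, the digest does not depend on the block size; only peak memory use does, since at most block_size bytes are held at a time.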

Examples

>>> from io import BytesIO
>>> from skbio.util import safe_md5
>>> fd = BytesIO(b"foo bar baz") # open binary file-like object
>>> x = safe_md5(fd)
>>> x.hexdigest()
'ab07acbb1e496801937adfa772424bf7'
>>> fd.close()