Tuesday, February 5, 2013

Python: Reading large bz2 file with bz2.BZ2File()

When reading a large bz2 file in Python with bz2.BZ2File(), the read may stop early and return only part of the file's contents.

The workaround is simple: decompress the bz2 file with an extraction utility (Ubuntu ships a graphical one by default), then compress the extracted file back to bz2. Reading the new file with bz2.BZ2File() should then return the full contents.
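The same round trip can also be scripted instead of done by hand. The sketch below is only one way to do it and assumes Python 3.3 or later, where bz2.open() reads through every compressed stream in a file. The function name rewrite_as_single_stream and both paths are made up for illustration.

```python
import bz2
import shutil


def rewrite_as_single_stream(src_path, dst_path):
    """Decompress src_path and recompress it into dst_path as one bz2 stream.

    Assumes Python 3.3+, where bz2.open() reads all streams in the source.
    """
    with bz2.open(src_path, "rb") as src, bz2.open(dst_path, "wb") as dst:
        # Copy in blocks so a large file never has to fit in memory.
        shutil.copyfileobj(src, dst)
```

Calling rewrite_as_single_stream("data.bz2", "data_fixed.bz2") (hypothetical file names) leaves the original untouched and writes a new file that older readers should be able to consume in one pass.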

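Reason for the problem: whoever produced the bz2 file may have built it from multiple files, for example by concatenating several separately compressed bz2 streams into one file. bz2.BZ2File() in Python 2 (and in Python 3 before version 3.3) does not handle such multi-stream files and stops after the first stream.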
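If recompressing the file is not an option, a file with several streams can be read by hand with bz2.BZ2Decompressor, starting a fresh decompressor each time one stream ends. This is a minimal sketch, not a definitive implementation: the function name iter_multistream_bz2 and the chunk_size default are assumptions, and it relies on the eof attribute, which exists in Python 3.3 and later.

```python
import bz2


def iter_multistream_bz2(path, chunk_size=1024 * 1024):
    """Yield decompressed blocks from a bz2 file that may hold several streams."""
    with open(path, "rb") as raw:
        decompressor = bz2.BZ2Decompressor()
        while True:
            chunk = raw.read(chunk_size)
            if not chunk:
                break
            while chunk:
                data = decompressor.decompress(chunk)
                if data:
                    yield data
                if decompressor.eof:
                    # This stream is finished; any leftover bytes belong
                    # to the next stream, so hand them to a new decompressor.
                    chunk = decompressor.unused_data
                    decompressor = bz2.BZ2Decompressor()
                else:
                    # The whole chunk was consumed; read more from the file.
                    chunk = b""
```

Joining the yielded blocks, for example with b"".join(iter_multistream_bz2("data.bz2")) on a hypothetical file, gives the contents of all streams in order. For very large files it is better to process each block as it arrives.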