I need to figure out how to write output to a compressed file in Python, similar to the Perl two-liner below:
open ZIPPED, "| gzip -c > zipped.gz";
print ZIPPED "Hello world\n";
In Perl, this opens a pipe to the Unix gzip command, so whatever you print to the ZIPPED filehandle is compressed and written to the file "zipped.gz".
I know how to do this in Python with the gzip module, like this:
import gzip

# Binary mode expects bytes, so write a bytes literal (required on Python 3).
with gzip.open("zipped.gz", "wb") as zipped:
    zipped.write(b"Hello world\n")
However, that is extremely slow. According to the profiler, it accounts for 90% of my run time, since I am writing 200GB of uncompressed data to various output files. I am aware that the file system could be part of the problem, but I want to find out whether the gzip module is the bottleneck by using Unix/Linux compression instead. This is partly because I have heard that decompressing with this same module is slow as well.
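To make the goal concrete, here is a minimal sketch of what a Python equivalent of the Perl pipe might look like, using the standard subprocess module. It assumes a gzip executable is available on the PATH; the helper name write_via_gzip is hypothetical and only for illustration. Whether this is actually faster than the gzip module is not something the sketch establishes.

```python
import subprocess


def write_via_gzip(path, chunks):
    """Stream byte chunks through an external `gzip -c` process into `path`.

    Assumes a `gzip` binary is on the PATH (hypothetical helper, not a library API).
    """
    with open(path, "wb") as out:
        # gzip reads uncompressed bytes from its stdin and writes the
        # compressed stream to stdout, which is redirected to the file.
        proc = subprocess.Popen(["gzip", "-c"], stdin=subprocess.PIPE, stdout=out)
        try:
            for chunk in chunks:
                proc.stdin.write(chunk)
        finally:
            # Closing stdin signals end of input so gzip can finish and exit.
            proc.stdin.close()
            returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError("gzip exited with status %d" % returncode)


# Same effect as the Perl two-liner above.
write_via_gzip("zipped.gz", [b"Hello world\n"])
```

The compression runs in a separate process here, so it can overlap with the Python code that produces the data. The output can be checked with `gzip -dc zipped.gz`.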