I'm trying to figure out how to use mmap with a gzip-compressed file. Is that even possible?

import mmap
import os
import gzip

filename = r'C:\temp\data.gz'

file = gzip.open(filename, "rb+")
size = os.path.getsize(filename)

file = mmap.mmap(file.fileno(), size)

print file.read(8)

The output data is compressed.

Uwe Keim
mab
  • This doesn't look like C++ or C#, since the `;` is missing at the end of each line. Maybe Python or Ruby? – Uwe Keim Feb 26 '11 at 16:03
  • @uwe, that import syntax and those library functions are Python – tobyodavies Feb 26 '11 at 16:07
  • Are you looking for zlib? zlib is the same algorithm as gzip, though you might have to twiddle some setting to get it working exactly the same. – wisty Feb 26 '11 at 16:27

2 Answers


You can do this easily: gzip.GzipFile accepts an optional file-like object through its fileobj argument, and an mmap object is file-like.

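import mmap
import gzip

filename = "a.gz"
# Map the compressed file read-only; a length of 0 maps the whole file.
handle = open(filename, "rb")
mapped = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)
# GzipFile decompresses on the fly while reading from the mapping.
gzfile = gzip.GzipFile(mode="r", fileobj=mapped)

print(gzfile.read())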

The same applies to the tarfile module:

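import sys
import mmap
import tarfile

# Map the .tar.gz archive named on the command line, read-only.
f = open(sys.argv[1], 'rb')
fo = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
# 'r:gz' tells tarfile to decompress the gzip stream it reads from the mapping.
tf = tarfile.open(mode='r:gz', fileobj=fo)

print(tf.getnames())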
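
For anyone who wants to check the gzip case end to end, here is a minimal, self-contained sketch. It assumes Python 3, and the names `read_gzip_via_mmap`, `demo`, and `sample.gz` are made up for illustration. It writes a small gzip file into a temporary directory, maps it, and confirms that reading through `gzip.GzipFile` returns the original bytes.

```python
import gzip
import mmap
import os
import tempfile


def read_gzip_via_mmap(path):
    """Return the decompressed contents of a gzip file read through a memory map."""
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; the mapping itself holds compressed bytes.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # GzipFile pulls compressed bytes from the mapping and inflates them.
            with gzip.GzipFile(mode="rb", fileobj=mapped) as gzfile:
                return gzfile.read()


def demo():
    """Round-trip a hypothetical payload and report whether it survived intact."""
    payload = b"hello mmap and gzip\n" * 1000
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "sample.gz")
        with gzip.open(path, "wb") as out:
            out.write(payload)
        return read_gzip_via_mmap(path) == payload


if __name__ == "__main__":
    print(demo())
```

The context managers close the GzipFile, the mapping, and the file handle in that order, which matters on Windows, where a file that is still mapped cannot be deleted.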
nopper

Well, not the way you want.

mmap() can be used to access the gzipped file if the compressed data is what you want.

mmap() is a system call for mapping disk blocks into RAM almost as if you were adding swap.

You can't map the uncompressed data into RAM with mmap(), because the uncompressed data does not exist on disk.
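
To make that point concrete, here is a rough sketch, assuming Python 3, with hypothetical helper names (`mapped_prefix`, `inflate_to_file`, `demo`) and file names. It shows that a mapping of a .gz file starts with the gzip magic bytes rather than the payload. It also shows one possible workaround, which is my assumption rather than something this answer proposes: decompress to a regular file first, so that the plain bytes exist on disk and can be mapped.

```python
import gzip
import mmap
import os
import shutil
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member


def mapped_prefix(path, count=2):
    """Return the first `count` bytes seen through a read-only mapping of `path`."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:count]


def inflate_to_file(gz_path, out_path):
    """Decompress gz_path into out_path so the plain bytes exist on disk."""
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def demo():
    """Compare what mmap sees in the compressed file and in an inflated copy."""
    payload = b"plain text payload\n" * 500
    with tempfile.TemporaryDirectory() as tmpdir:
        gz_path = os.path.join(tmpdir, "data.gz")
        raw_path = os.path.join(tmpdir, "data.raw")
        with gzip.open(gz_path, "wb") as out:
            out.write(payload)
        # The mapping of the .gz file exposes the compressed stream.
        sees_compressed = mapped_prefix(gz_path) == GZIP_MAGIC
        # After inflating to disk, a mapping exposes the original bytes.
        inflate_to_file(gz_path, raw_path)
        plain_view = mapped_prefix(raw_path, count=5)
        return sees_compressed, plain_view


if __name__ == "__main__":
    print(demo())
```

The inflated copy costs disk space equal to the uncompressed size, so this only makes sense when random access to the plain data is worth that trade.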

Joshua