I am attempting to download the MNIST dataset and decode it without writing it to disk (mostly for fun).
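    import struct
    from gzip import GzipFile
    from urllib.request import urlopen

    import numpy as np

    def load_mnist_images():
        request_stream = urlopen('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz')
        zip_file = GzipFile(fileobj=request_stream, mode='rb')
        with zip_file as fd:
            # IDX3 header: magic number and item count, then rows and columns (big-endian)
            magic, numberOfItems = struct.unpack('>ii', fd.read(8))
            rows, cols = struct.unpack('>II', fd.read(8))
            images = np.fromfile(fd, dtype='uint8')  # < here be dragons
            images = images.reshape((numberOfItems, rows, cols))
            return images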
This code fails with `OSError: obtaining file position failed`, an error that seems to be ungoogleable. What could the problem be?
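One likely cause, offered as a guess rather than a confirmed diagnosis: `np.fromfile` expects a real on-disk file whose position it can query at the OS level, and a `GzipFile` wrapped around an HTTP response does not provide that. A minimal sketch of a workaround is to read the remaining bytes into memory and hand them to `np.frombuffer`. The function name `decode_idx_images` and the tiny synthetic file used to exercise it are hypothetical, added only for illustration; in the real case the `urlopen(...)` response would be passed in as `stream`.

```python
import gzip
import io
import struct

import numpy as np


def decode_idx_images(stream):
    """Decode a gzip-compressed IDX3 image file from a binary file-like object.

    Hypothetical helper: `stream` can be anything with a read() method,
    for example the response object returned by urlopen().
    """
    with gzip.GzipFile(fileobj=stream, mode='rb') as fd:
        # Header: magic number and item count, then rows and columns (big-endian)
        magic, number_of_items = struct.unpack('>ii', fd.read(8))
        rows, cols = struct.unpack('>II', fd.read(8))
        # Read the rest as bytes instead of asking NumPy to read the file itself
        data = fd.read()
    images = np.frombuffer(data, dtype=np.uint8)
    return images.reshape((number_of_items, rows, cols))


# Synthetic stand-in for the download: 2 images of 3x4 pixels (assumed values)
header = struct.pack('>iiII', 2051, 2, 3, 4)
pixels = bytes(range(24))
fake_response = io.BytesIO(gzip.compress(header + pixels))
sample = decode_idx_images(fake_response)
```

Note that `np.frombuffer` returns a read-only view over the bytes object; calling `.copy()` on the result gives a writable array if one is needed.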