5

I'm a complete novice with pickle, and I have a bunch of (about 100,000) images that need to be pickled.

Each one is first loaded as an image object and then converted to a dict like this:

image = {
    'pixels': im.tostring(),
    'size': im.size,
    'mode': im.mode,
}

Now how do I pickle them into one pkl file?

martineau
ytrewq
  • Is there a reason you want to? Why not store them in, say, PNG format instead of pickle? It'll be a lot smaller, and probably load faster too. – abarnert Aug 03 '14 at 12:28
  • Also, what part are you not getting here? If you know how to pickle things, how do you not know how to pickle this dict? – abarnert Aug 03 '14 at 12:31
  • @abarnert I need to run it with this code, http://deeplearning.net/tutorial/code/rbm.py, and it seems to work with pkl files. – ytrewq Aug 03 '14 at 12:34
  • @abarnert I'm a complete novice with pickle, and don't know how to pickle things in the first place. – ytrewq Aug 03 '14 at 12:35
  • Have you read [the docs](https://docs.python.org/3/library/pickle.html)? – abarnert Aug 03 '14 at 12:38

2 Answers

5

You could do it like this:

import pickle

file = open('data.pkl', 'wb')

# Pickle the dictionary using the default protocol.
pickle.dump(image, file)
file.close()

To read in the dictionary, you can do it like this:

file = open('data.pkl', 'rb')

# Load the dictionary back from the open file object.
image = pickle.load(file)
print(image)
file.close()

It is also possible to dump several objects into the same file, one after another, and load them back in the same order:

import pickle

# Write to file.
file = open("data.pkl", "wb")
pickle.dump(image1, file)
pickle.dump(image2, file)
file.close()

# Read from file.
file = open("data.pkl", "rb")
image1 = pickle.load(file)
image2 = pickle.load(file)
file.close()
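
With about 100,000 images you would not want to write one `dump` line per image. Here is a minimal sketch of the same "keep dumping" idea in a loop, reading back until the file runs out. The helper names `dump_all` and `load_all` and the tiny sample records are made up for illustration; only `pickle.dump`, `pickle.load` and the `EOFError` it raises at end of file come from the standard library.

```python
import os
import pickle
import tempfile

def dump_all(records, path):
    """Write each record as its own pickle, back to back, in one file."""
    with open(path, 'wb') as f:
        for record in records:
            pickle.dump(record, f)

def load_all(path):
    """Yield records one at a time until the file is exhausted."""
    with open(path, 'rb') as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # pickle.load raises EOFError when no pickle is left to read.
                break

# Hypothetical stand-ins for the image dicts from the question.
records = [
    {'pixels': b'\x00' * 12, 'size': (2, 2), 'mode': 'RGB'},
    {'pixels': b'\xff' * 4, 'size': (2, 2), 'mode': 'L'},
]

path = os.path.join(tempfile.mkdtemp(), 'data.pkl')
dump_all(records, path)
restored = list(load_all(path))
```

Because `load_all` is a generator, you can also iterate over it directly and never hold more than one image dict in memory at a time.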
miindlek
  • If I have many image objects to pickle, should I first put them in an array and then pickle it, or should I just keep dumping? – ytrewq Aug 03 '14 at 12:33
  • Yes, you should put them in a list first. But it is also possible to dump more than one object into the same file. Look at my update. – miindlek Aug 03 '14 at 12:36
  • @CosmicRabbitMediaInc: Well, if you have many image objects to pickle, they should probably _already_ be in a list. If you have the same code copied and pasted 40 times to use `image_1`, `image_2`, etc., don't do that. – abarnert Aug 03 '14 at 12:36
  • @CosmicRabbitMediaInc: You can write multiple objects to the same pickle file, one by one, or put them all in a container like a list and write just that. The latter would probably be quicker. See [my answer](http://stackoverflow.com/a/4529901/355230) to the question yours was marked as duplicating. – martineau Dec 21 '14 at 20:53
2

Just call pickle.dump, the same way you would for anything else. You have a dict whose values are all simple types (strings, tuples of a couple numbers, etc.). The fact that it came from an image is irrelevant.

If you have a bunch of them, presumably they're stored in a list or some other structure, and you can pickle a list of pickleable objects.

So:

import pickle

with open('data.pkl', 'wb') as f:
    pickle.dump(images, f)
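
For a complete round trip, here is a rough sketch that builds the list, writes it as one pickle and reads it back. The `make_record` helper, the file name `images.pkl` and the tiny fake pixel data are assumptions for illustration; in real code each dict would come from your image objects as in the question. Passing `protocol=pickle.HIGHEST_PROTOCOL` is optional and is my addition; it selects the binary format, which is usually more compact for raw pixel bytes.

```python
import os
import pickle
import tempfile

def make_record(pixels, size, mode):
    """Build a dict with the same keys as the one in the question."""
    return {'pixels': pixels, 'size': size, 'mode': mode}

# Hypothetical data: three tiny 2x2 grayscale "images".
images = [make_record(bytes([i] * 4), (2, 2), 'L') for i in range(3)]

path = os.path.join(tempfile.mkdtemp(), 'images.pkl')

# Write the whole list as a single pickle.
with open(path, 'wb') as f:
    pickle.dump(images, f, protocol=pickle.HIGHEST_PROTOCOL)

# Read the list back in one call.
with open(path, 'rb') as f:
    loaded = pickle.load(f)
```

Note that this loads every image into memory at once, which may or may not be acceptable for 100,000 images depending on their size.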
abarnert