I have looked through the information that the Python documentation for pickle gives, but I'm still a little confused. What would be some sample code that would write a new file and then use pickle to dump a dictionary into it?
-
Read through this: http://www.doughellmann.com/PyMOTW/pickle/ and come back when you need a specific question – pyfunc Jun 27 '12 at 02:16
-
Check here first though http://stackoverflow.com/questions/5145664/storing-unpicklabe-pygame-surface-objects-in-external-files – John La Rooy Jun 27 '12 at 03:00
10 Answers
Try this:
import pickle
a = {'hello': 'world'}
with open('filename.pickle', 'wb') as handle:
    pickle.dump(a, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open('filename.pickle', 'rb') as handle:
    b = pickle.load(handle)
print(a == b)
There's nothing about the above solution that is specific to a dict object. This same approach will work for many Python objects, including instances of arbitrary classes and arbitrarily complex nestings of data structures. For example, replacing the second line with these lines:
import datetime
today = datetime.datetime.now()
a = [{'hello': 'world'}, 1, 2.3333, 4, True, "x",
     ("y", [[["z"], "y"], "x"]), {'today', today}]
will produce a result of True as well.
Some objects can't be pickled due to their very nature. For example, it doesn't make sense to pickle a structure containing a handle to an open file.
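To see what that failure looks like, here is a minimal sketch. The temporary directory, the file name log.txt, and the variable names are assumptions made up for illustration. In Python 3, pickling a dict that holds an open file raises a TypeError:

```python
import os
import pickle
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Open a throwaway file and put the handle inside a dict (illustrative names).
    with open(os.path.join(tmp, "log.txt"), "w") as handle:
        data = {"name": "log", "file": handle}
        try:
            pickle.dumps(data)
            error = None
        except TypeError as exc:
            # Python 3 refuses to pickle file objects.
            error = exc

print(type(error).__name__, error)
```

A common workaround is to store the file's path (a plain string) in the structure instead of the handle, and reopen the file after loading.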

-
@BallpointBen: It picks the highest protocol version your version of Python supports: https://docs.python.org/3/library/pickle.html#data-stream-format – Blender May 03 '18 at 00:53
-
To make it more concise you can write `protocol=-1` (similar to -1 indexing in a list). – Matthew D. Scholefield Nov 01 '19 at 04:13
-
If you are saving/loading a large object, please do use `pickle.HIGHEST_PROTOCOL`. Otherwise you may waste a lot of time and disk space. – Qin Heyang Sep 09 '22 at 00:38
-
`HIGHEST_PROTOCOL` is subject to change, so it is better to choose a protocol and stick to it. Otherwise you will get your pickles deprecated and end up trying to figure out which protocol was used. – nurettin Mar 01 '23 at 11:12
-
@nurettin why would one need to figure out which protocol was used? The documentation for pickle.load reads "The protocol version of the pickle is detected automatically, so no protocol argument is needed.". In fact, pickle.load does not even have the option to specify the protocol. – Bastian Mar 24 '23 at 18:07
-
@Bastian oh, at the time I was thinking of loading the data parts from other languages. You can safely ignore my comment. Sorry for the confusion. – nurettin Mar 26 '23 at 19:28
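For anyone following this comment thread, a small sketch of the points raised: `protocol=-1` selects the highest protocol, `pickle.loads` needs no protocol argument, and a protocol can be pinned explicitly. The choice of protocol 4 as the pinned value is only an assumption for the example:

```python
import pickle

data = {"hello": "world"}

# protocol=-1 is shorthand for pickle.HIGHEST_PROTOCOL.
same_bytes = (
    pickle.dumps(data, protocol=-1)
    == pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
)

# loads() takes no protocol argument: the version is detected from the data.
round_trips = [
    pickle.loads(pickle.dumps(data, protocol=p)) == data
    for p in range(pickle.HIGHEST_PROTOCOL + 1)
]

# Pinning a protocol (4 here, available since Python 3.4) keeps files
# readable by older interpreters that do not know newer protocols.
pinned = pickle.dumps(data, protocol=4)

# For protocol 2 and later, the stream starts with a marker byte
# followed by the protocol number.
print(same_bytes, all(round_trips), pinned[:2])
```

Pinning mostly matters when the file must be read by an older Python than the one that wrote it.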
Use:
import pickle
your_data = {'foo': 'bar'}
# Store data (serialize)
with open('filename.pickle', 'wb') as handle:
    pickle.dump(your_data, handle, protocol=pickle.HIGHEST_PROTOCOL)
# Load data (deserialize)
with open('filename.pickle', 'rb') as handle:
    unserialized_data = pickle.load(handle)
print(your_data == unserialized_data)
The advantage of HIGHEST_PROTOCOL is that files get smaller, which can sometimes make unpickling much faster.
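A rough way to check the size difference yourself is to serialize the same object with every protocol and compare the byte counts. The sample data below is an assumption, and the exact numbers depend on your data and Python version:

```python
import pickle

# Illustrative data: a dict holding a list of integers.
data = {"values": list(range(10_000)), "label": "example"}

# Serialized size in bytes for each protocol this interpreter supports.
size_by_protocol = {
    protocol: len(pickle.dumps(data, protocol=protocol))
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1)
}

for protocol, size in size_by_protocol.items():
    print(f"protocol {protocol}: {size} bytes")
```

Protocol 0 is a text-based format, so it is typically the largest; the binary protocols encode small integers in far fewer bytes.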
Important notice: The answer was written in 2015 (Python 3.4!). Back then, the maximum file size of pickle was about 2 GB.
Alternative way
import mpu
your_data = {'foo': 'bar'}
mpu.io.write('filename.pickle', your_data)
unserialized_data = mpu.io.read('filename.pickle')
Alternative Formats
- CSV: Super simple format (read & write)
- JSON: Nice for writing human-readable data; very commonly used (read & write)
- YAML: YAML is a superset of JSON, but easier to read (read & write, comparison of JSON and YAML)
- pickle: A Python serialization format (read & write)
- MessagePack (Python package): More compact representation (read & write)
- HDF5 (Python package): Nice for matrices (read & write)
- XML: exists too *sigh* (read & write)
For your application, the following might be important:
- Support by other programming languages
- Reading / writing performance
- Compactness (file size)
See also: Comparison of data serialization formats
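As a point of comparison with pickle, here is a minimal JSON round trip for the same kind of dictionary, using only the standard library. The temporary directory and the file name filename.json are assumptions for the example:

```python
import json
import os
import tempfile

your_data = {"foo": "bar", "numbers": [1, 2, 3]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "filename.json")
    # Store data (serialize) as human-readable text.
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(your_data, handle, indent=2)
    # Load data (deserialize).
    with open(path, "r", encoding="utf-8") as handle:
        loaded = json.load(handle)

print(your_data == loaded)
```

Unlike pickle, JSON only covers basic types: dictionary keys come back as strings and tuples come back as lists, so the comparison above holds only because this data sticks to strings, numbers, and lists.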
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python

-
How did you determine the maximum limit? I was not aware of any limit and have pickled and unpickled 7GB in the past, without encountering anything suspicious. – Bastian Mar 24 '23 at 18:15
-
I don't remember exactly as this is more than 8 years ago. I think I just ran into an error message – Martin Thoma Mar 24 '23 at 18:17
-
I just pickled and unpickled a 4.7 GB file. You might want to remove the point. – Harshvardhan Jul 12 '23 at 22:14
-
Save a dictionary into a pickle file.
import pickle
favorite_color = {"lion": "yellow", "kitty": "red"} # create a dictionary
pickle.dump(favorite_color, open("save.p", "wb")) # save it into a file named save.p
# -------------------------------------------------------------
# Load the dictionary back from the pickle file.
import pickle
favorite_color = pickle.load(open("save.p", "rb"))
# favorite_color is now {"lion": "yellow", "kitty": "red"}

A simple way to dump Python data (e.g., a dictionary) to a pickle file:
import pickle
your_dictionary = {}
pickle.dump(your_dictionary, open('pickle_file_name.p', 'wb'))

In general, pickling a dict will fail unless you have only simple objects in it, like strings and integers.
Python 2.7.9 (default, Dec 11 2014, 01:21:43)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from numpy import *
>>> type(globals())
<type 'dict'>
>>> import pickle
>>> pik = pickle.dumps(globals())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
Pickler(file, protocol).dump(obj)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 663, in _batch_setitems
save(v)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 306, in save
rv = reduce(self.proto)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle module objects
>>>
Even a really simple dict will often fail. It just depends on the contents.
>>> d = {'x': lambda x:x}
>>> pik = pickle.dumps(d)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps
Pickler(file, protocol).dump(obj)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 663, in _batch_setitems
save(v)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 748, in save_global
(obj, module, name))
pickle.PicklingError: Can't pickle <function <lambda> at 0x102178668>: it's not found as __main__.<lambda>
However, if you use a better serializer like dill or cloudpickle, then most dictionaries can be pickled:
>>> import dill
>>> pik = dill.dumps(d)
Or if you want to save your dict to a file...
>>> with open('save.pik', 'wb') as f:
...     dill.dump(globals(), f)
...
The latter example is identical to any of the other good answers posted here (which aside from neglecting the picklability of the contents of the dict are good).
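If you want to know up front which entries the standard pickle module will reject, one option is to try each value separately. This is only a sketch: `unpicklable_keys` is a hypothetical helper name, and it only checks top-level values one at a time:

```python
import pickle
import sys


def unpicklable_keys(mapping):
    """Return the keys whose values the standard pickle module rejects.

    Hypothetical helper for illustration; not part of pickle or dill.
    """
    bad = []
    for key, value in mapping.items():
        try:
            pickle.dumps(value)
        except (pickle.PicklingError, TypeError, AttributeError):
            # Different unpicklable objects raise different exception types.
            bad.append(key)
    return bad


d = {"n": 1, "s": "text", "f": lambda x: x, "m": sys}
problem_keys = unpicklable_keys(d)
print(problem_keys)
```

Here the lambda and the module are reported, matching the two tracebacks above, while the integer and the string pass.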

Use:
>>> import pickle
>>> with open("/tmp/picklefile", "wb") as f:
...     pickle.dump({}, f)
...
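Normally it's preferable to use the cPickle implementation (this applies to Python 2 only; in Python 3 there is no cPickle module, and pickle uses its C implementation automatically when available):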
>>> import cPickle as pickle
>>> help(pickle.dump)
Help on built-in function dump in module cPickle:
dump(...)
dump(obj, file, protocol=0) -- Write an object in pickle format to the given file.
See the Pickler docstring for the meaning of optional argument proto.
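For code that has to run on both Python 2 and Python 3, a commonly used pattern is an import fallback. This is a sketch; protocol 2 is chosen here only because both major versions can read it:

```python
try:
    import cPickle as pickle  # Python 2: the faster C implementation
except ImportError:
    import pickle  # Python 3: the C accelerator is used automatically if available

# Protocol 2 is understood by both Python 2 and Python 3.
blob = pickle.dumps({"hello": "world"}, protocol=2)
restored = pickle.loads(blob)
print(restored == {"hello": "world"})
```

On Python 3 alone, a plain `import pickle` is all that is needed.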

If you just want to store the dict in a single file, use pickle like this:
import pickle
a = {'hello': 'world'}
with open('filename.pickle', 'wb') as handle:
    pickle.dump(a, handle)
with open('filename.pickle', 'rb') as handle:
    b = pickle.load(handle)
If you want to save and restore multiple dictionaries in multiple files for caching, or need to store more complex data, use anycache. It does all the other stuff you need around pickle:
from anycache import anycache
@anycache(cachedir='path/to/files')
def myfunc(hello):
    return {'hello': hello}
Anycache stores the different myfunc results, depending on the arguments, to different files in cachedir and reloads them.
See the documentation for any further details.
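To make the idea concrete without installing anything, here is a rough standard-library sketch of caching function results in pickle files. It is not how anycache is implemented; `pickle_cache` is a hypothetical decorator, and the hashing scheme and temporary cache directory are assumptions for illustration:

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


def pickle_cache(cachedir):
    """Hypothetical decorator: store each result in a pickle file under cachedir."""
    cachedir = Path(cachedir)
    cachedir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Derive the file name from the function name and its arguments.
            key = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            path = cachedir / (hashlib.sha256(key).hexdigest() + ".pickle")
            if path.exists():
                # Cache hit: reload the stored result instead of recomputing.
                with path.open("rb") as handle:
                    return pickle.load(handle)
            result = func(*args, **kwargs)
            with path.open("wb") as handle:
                pickle.dump(result, handle, protocol=pickle.HIGHEST_PROTOCOL)
            return result

        return wrapper

    return decorator


calls = []  # records how often the function body actually runs


@pickle_cache(tempfile.mkdtemp())
def myfunc(hello):
    calls.append(hello)
    return {"hello": hello}


first = myfunc("world")   # computed and written to disk
second = myfunc("world")  # loaded from the pickle file
print(first == second, calls)
```

A real caching library also has to deal with invalidation when the function changes, concurrent access, and arguments that cannot be pickled, which this sketch ignores.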

FYI, Pandas has a function to save pickles now. I find it easier.
import pandas as pd
pd.to_pickle(object_to_save, '/temp/saved_pkl.pickle')

import pickle
dictobj = {'Jack' : 123, 'John' : 456}
filename = "/foldername/filestore"
fileobj = open(filename, 'wb')
pickle.dump(dictobj, fileobj)
fileobj.close()

If you want to handle writing or reading in one line without file opening:
import joblib
my_dict = {'hello': 'world'}
joblib.dump(my_dict, "my_dict.pickle") # write pickle file
my_dict_loaded = joblib.load("my_dict.pickle") # read pickle file
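If you would rather not add a dependency, a similar one-line style is possible with the standard library by combining `pickle.dumps`/`pickle.loads` with `pathlib`. This is a sketch, and the temporary directory is an assumption for the example:

```python
import pickle
import tempfile
from pathlib import Path

my_dict = {"hello": "world"}
path = Path(tempfile.mkdtemp()) / "my_dict.pickle"

# write pickle file (Path.write_bytes opens and closes the file itself)
path.write_bytes(pickle.dumps(my_dict, protocol=pickle.HIGHEST_PROTOCOL))
# read pickle file
my_dict_loaded = pickle.loads(path.read_bytes())

print(my_dict == my_dict_loaded)
```

This holds the whole serialized object in memory at once, so for very large objects `pickle.dump` with an open file handle may be the better fit.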

-
This is irrelevant, as OP did not ask about caching in this case. – Arka Mukherjee May 18 '22 at 00:18
-
Where is caching here? It is saving the dictionary content into a pickle file as asked in the question. – gench Dec 05 '22 at 13:21