How do I serialize a Python dictionary into a string, and then back to a dictionary? The dictionary will have lists and other dictionaries inside it.
-
Are you familiar with `pickle`? – Gabe Dec 03 '10 at 03:28
-
a module that is part of the Python Standard Library – Joachim Wagner Apr 16 '17 at 10:00
9 Answers
It depends on what you want to use it for. If you're just trying to save it, you should use `pickle` (or, if you're using CPython 2.x, `cPickle`, which is faster).
>>> import pickle
>>> pickle.dumps({'foo': 'bar'})
b'\x80\x03}q\x00X\x03\x00\x00\x00fooq\x01X\x03\x00\x00\x00barq\x02s.'
>>> pickle.loads(_)
{'foo': 'bar'}
If you want it to be readable, you could use `json`:
>>> import json
>>> json.dumps({'foo': 'bar'})
'{"foo": "bar"}'
>>> json.loads(_)
{'foo': 'bar'}
`json` is, however, very limited in what it will support, while `pickle` can be used for arbitrary objects (if it doesn't work automatically, the class can define `__getstate__` to specify precisely how it should be pickled).
>>> pickle.dumps(object())
b'\x80\x03cbuiltins\nobject\nq\x00)\x81q\x01.'
>>> json.dumps(object())
Traceback (most recent call last):
...
TypeError: <object object at 0x7fa0348230c0> is not JSON serializable
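For the nested case the question asks about, a minimal sketch of both round trips might look like the following. The `config` dictionary and its contents are made up for illustration; only `json.dumps`/`json.loads` and `pickle.dumps`/`pickle.loads` come from the answer above.

```python
import json
import pickle

# Hypothetical nested data: a dict holding lists and another dict.
config = {
    "name": "demo",
    "ports": [80, 443],
    "owner": {"id": 7, "tags": ["a", "b"]},
}

# JSON produces a readable str and restores the same structure
# as long as keys are strings and values are basic types.
as_json = json.dumps(config)
from_json = json.loads(as_json)

# pickle produces bytes that only Python can read back.
as_bytes = pickle.dumps(config)
from_pickle = pickle.loads(as_bytes)

print(type(as_json).__name__, type(as_bytes).__name__)
print(from_json == config, from_pickle == config)
```

Note that `json.dumps` returns a `str` while `pickle.dumps` returns `bytes`, so only the JSON form is a "string" in the strict sense.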

-
I guess this -1 might be for not mentioning security problems inherent in pickling. See http://stackoverflow.com/questions/10282175/attacking-pythons-pickle – Piotr Dobrogost Oct 10 '14 at 11:05
-
It is worth mentioning that the cPickle part of the answer is not relevant for python 3.x. See [here](https://docs.python.org/3.1/whatsnew/3.0.html#library-changes) for the official explanation. In short, the accelerated C version of a package should be the default choice for any python module, and, if not available, the module itself falls back to the python implementation. This encapsulates the implementation from the user. Quote: `In Python 3.0... Users should always import the standard version, which attempts to import the accelerated version and falls back to the pure Python version.` – Ori Dec 25 '17 at 08:23
-
"**Warning** The pickle module **is not secure**. Only unpickle data you trust." - [docs](https://docs.python.org/3/library/pickle.html) – ArtuX Feb 17 '20 at 08:24
Pickle is great, but I think it's worth mentioning `literal_eval` from the `ast` module as an even lighter-weight solution if you're only serializing basic Python types. It's basically a "safe" version of the notorious `eval` function that only allows evaluation of basic Python types, as opposed to any valid Python code.
Example:
>>> d = {}
>>> d[0] = list(range(10))
>>> d['1'] = {}
>>> d['1'][0] = list(range(10))
>>> d['1'][1] = 'hello'
>>> data_string = str(d)
>>> print(data_string)
{0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], '1': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 1: 'hello'}}
>>> from ast import literal_eval
>>> d == literal_eval(data_string)
True
One benefit is that the serialized data is just Python code, so it's very human friendly. Compare it to what you would get with `pickle.dumps` (Python 2 output shown):
>>> import pickle
>>> print pickle.dumps(d)
(dp0
I0
(lp1
I0
aI1
aI2
aI3
aI4
aI5
aI6
aI7
aI8
aI9
asS'1'
p2
(dp3
I0
(lp4
I0
aI1
aI2
aI3
aI4
aI5
aI6
aI7
aI8
aI9
asI1
S'hello'
p5
ss.
The downside is that as soon as the data includes a type that is not supported by `literal_eval`, you'll have to transition to something else, like pickling.
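To make that limitation concrete, here is a small sketch. The dictionaries are invented for illustration, and `datetime.date` stands in for "a type that is not supported"; any non-literal type should behave similarly.

```python
import datetime
from ast import literal_eval

# Basic types round-trip through repr() and literal_eval().
basic = {0: list(range(3)), "1": {"greeting": "hello"}}
basic_restored = literal_eval(repr(basic))

# A date's repr is a constructor call, not a literal, so parsing is refused.
with_date = {"created": datetime.date(2020, 1, 1)}
try:
    literal_eval(repr(with_date))
    date_rejected = False
except ValueError:
    date_rejected = True

print(basic_restored == basic, date_rejected)
```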

-
`literal_eval` seems to me unable to handle instances of user-defined classes, even the simplest ones; the solution mentioned by @georg (namely PyYAML) does that and produces human-readable serializations. From the pypi.org project home (June 2021): `YAML is a data serialization format designed for human readability and interaction with scripting languages. PyYAML is a YAML parser and emitter for Python.` – Giovanni Faglia Jun 11 '21 at 13:09
Use Python's json module, or simplejson if you don't have python 2.6 or higher.

-
+1: json is way better than pickle and can be used in the same way: `json.dumps(mydict)` and `json.loads(mystring)` – nosklo Dec 03 '10 at 10:19
-
but json can only do strings, numbers, lists, and dictionaries, while pickle can do almost any Python type; on the other hand, json is far more portable than pickle for the types it can do – Dan D. Dec 03 '10 at 10:22
-
When you use `json.dumps()`, watch out for `False`, `True`, and `None`: they are written as JSON `false`, `true`, and `null`, so the resulting string is not a valid Python literal – Jason Heo Sep 09 '16 at 02:44
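As a quick check of how `json` treats `True`, `False`, and `None`, here is a short sketch; the `flags` dictionary is made up for illustration.

```python
import json

flags = {"safe": False, "checked": True, "guarantees": None}

# The JSON text uses lowercase false/true/null instead of Python's names.
text = json.dumps(flags)
print(text)  # {"safe": false, "checked": true, "guarantees": null}

# json.loads maps them back to the Python values.
restored = json.loads(text)
print(restored == flags)  # True
```

So the values survive a `json` round trip, but the intermediate string has to be read back with `json.loads`, not with `eval` or `ast.literal_eval`.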
If you fully trust the string and don't care about Python injection attacks, then this is a very simple solution:
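d = {'method': "eval", 'safe': False, 'guarantees': None}
s = str(d)
d2 = eval(s)  # only acceptable when s is fully trusted
for k in d2:
    # str() is needed because some values are not strings
    print(k + "=" + str(d2[k]))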
If you're more safety conscious, then `ast.literal_eval` is a better bet.

-
honestly this is the method I use all the time. thanks for sharing the safety tip. I use repr instead of str if the dictionary contains custom made objects that can be initialized by the repr string – Evan Pu Mar 18 '16 at 10:24
-
You should use `ast.literal_eval` by default. `eval` has zero added value and a big security issue. – Jean-François Fabre Jan 12 '17 at 20:57
-
Bad things happen because people honestly thought there were no security concerns in their particular piece of code, so they could just happily `eval` away. I'm disgusted every time someone promotes this culture of sloppiness. Just use `json.dumps` and `json.loads` (or any other non-`eval` solution); there is no real reason not to – ArtuX Feb 17 '20 at 08:10
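The difference the comments are pointing at can be sketched as follows. Both input strings are invented for illustration; the second stands in for something an attacker might supply.

```python
from ast import literal_eval

trusted = "{'method': 'literal_eval', 'safe': True, 'guarantees': None}"
untrusted = "__import__('os').getcwd()"  # eval() would execute this call

# A plain dict literal parses without running any code.
parsed = literal_eval(trusted)

# Anything that is not a literal (here, a function call) is refused.
try:
    literal_eval(untrusted)
    rejected = False
except ValueError:
    rejected = True

print(parsed["safe"], rejected)
```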
One thing `json` cannot do is preserve a `dict` indexed by numbers. The following snippet
import json
dictionary = dict({0:0, 1:5, 2:10})
serialized = json.dumps(dictionary)
unpacked = json.loads(serialized)
print(unpacked[0])
will throw
KeyError: 0
because keys are converted to strings. `pickle` (`cPickle` on Python 2) preserves the numeric type, and the unpacked `dict` can be used right away.
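If you want to stay with `json` anyway, one possible workaround is to convert the keys back after loading. This is only a sketch and assumes every key was originally an integer; the variable names are illustrative.

```python
import json

original = {0: 0, 1: 5, 2: 10}

serialized = json.dumps(original)   # keys are written as "0", "1", "2"
raw = json.loads(serialized)        # keys come back as strings

# Rebuild the dict with integer keys (assumes all keys were ints).
restored = {int(key): value for key, value in raw.items()}

print(restored[0])  # 0
```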

PyYAML should also be mentioned here. Its output is human readable, and it can serialize arbitrary Python objects.
pyyaml is hosted here:
https://pypi.org/project/PyYAML

-
See https://pypi.org/project/PyYAML/. Consider vulnerabilities though (eg: https://www.ibm.com/support/pages/security-bulletin-vulnerability-pyyaml-affects-ibm-spectrum-protect-plus-container-and-microsoft-file-systems-agents-cve-2020-1747) – Giovanni Faglia Jun 11 '21 at 13:14
-
While not strictly serialization, json may be a reasonable approach here. It will handle nested dicts and lists, as long as your data is "simple": strings and basic numeric types.

A new alternative to JSON or YAML is NestedText. It supports strings that are nested in lists and dictionaries to any depth. It conveys nesting through the use of indenting, and so has no need for either quoting or escaping. As such, the result tends to be very readable. The result looks like YAML, but without all the special cases. It is especially appropriate for serializing code snippets. For example, here is a single test case extracted from a much larger set that was serialized with NestedText:
base tests:
    -
        args: --quiet --config test7 files -N configs/subdir
        expected:
            > Archive: test7-\d\d\d\d-\d\d-\d\dT\d\d:\d\d:\d\d
            > «TESTS»/configs/subdir/
            > «TESTS»/configs/subdir/file
Be aware that integers, floats, and bools are converted to strings.

If you are trying to only serialize then pprint may also be a good option. It requires the object to be serialized and a file stream.
Here's some code:
from pprint import pprint
my_dict = {1: 'a', 2: 'b'}
# text mode: pprint writes str, so 'wb' fails on Python 3
with open('test_results.txt', 'w') as f:
    pprint(my_dict, f)
I am not sure if we can deserialize easily. I was using json to serialize and deserialize earlier, which works correctly in most cases.
f.write(json.dumps(my_dict, sort_keys = True, indent = 2, ensure_ascii=True))
However, in one particular case, there were some errors writing non-unicode data to json.
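On the question of reading `pprint` output back: for dictionaries that contain only basic Python types, the text is a valid Python literal, so `ast.literal_eval` (mentioned in another answer) can usually parse it. A minimal sketch, using `pformat` instead of a file and made-up data:

```python
from ast import literal_eval
from pprint import pformat

my_dict = {1: "a", 2: "b", 3: {"items": [1, 2, 3]}}

# pformat returns the same text that pprint would write to a stream.
text = pformat(my_dict)

# For basic types the formatted text parses back to an equal object.
restored = literal_eval(text)

print(restored == my_dict)
```

This would not work for objects whose `repr` is not a literal, such as instances of custom classes.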
