27

I have a script that changes a dict to a string and saves it to a file. I'd like to then load that file and use it as a dict, but it's a string. Is there something like int("7") that can change a string formatted as a dict ({a: 1, b: 2}) into a dict? I've tried dict(), but this doesn't seem to be what it does. I've heard of some process involving JSON and eval(), but I don't really see what this does. The program loads the same data it saves, and if someone edits it and it doesn't work, it's no problem of mine (I don't need any advanced method of confirming the dict data or anything).

Óscar López
tkbx

4 Answers

57

Try this, it's the safest way:

>>> import ast
>>> ast.literal_eval("{'x': 1, 'y': 2}")
{'x': 1, 'y': 2}

All solutions based on eval() are dangerous: malicious code could be injected into the string and executed.

According to the documentation, the expression is evaluated safely. According to the source code, literal_eval parses the string into a Python AST (abstract syntax tree) and returns a value only if the tree consists of literals. The code is never executed, only parsed, so it cannot run injected code.
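As a minimal sketch of that behaviour (the helper name parse_dict is hypothetical, not part of the ast module), the following accepts a dict literal and rejects strings that contain calls or bare names, including the unquoted {a: 1, b: 2} form from the question:

```python
import ast

def parse_dict(text):
    """Return the dict described by `text`, or None if it is not a plain dict literal."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Raised for anything that is not a literal (calls, names) or not valid syntax.
        return None
    # literal_eval also accepts lists, numbers, etc.; keep only dicts here.
    return value if isinstance(value, dict) else None

print(parse_dict("{'x': 1, 'y': 2}"))           # a real dict
print(parse_dict("__import__('os').getcwd()"))  # a call: rejected, never run
print(parse_dict("{a: 1, b: 2}"))               # bare names are not literals
```

Returning None on failure is just one design choice; letting the exception propagate would work equally well if the caller wants to report the problem.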

Óscar López
  • 3
    I didn't know about `ast` module, thanks for this one! – Aif Dec 03 '12 at 01:35
  • Why is this safer? Isn't it susceptible to the same problem? – Matthew Adams Dec 03 '12 at 01:36
  • Oh, found the answer. It's because `ast.literal_eval` only works on literals. [source](http://docs.python.org/2/library/ast.html#ast.literal_eval) – Matthew Adams Dec 03 '12 at 01:39
  • 1
    @MatthewAdams according to the [documentation](http://docs.python.org/2/library/ast.html#ast.literal_eval) the expression gets evaluated safely. Also, according to the [source](http://hg.python.org/cpython/file/3.2/Lib/ast.py#l39), `literal_eval` parses the string to a python AST (source tree), and returns only if it is a literal. The code is never executed, only parsed, so there is no reason to be a security risk – Óscar López Dec 03 '12 at 01:40
  • Cool. I knew about `ast`, but I wouldn't have thought of using it here. +1 – Matthew Adams Dec 03 '12 at 01:42
  • Doesn't work for me on Python 3.4, the Yaml method worked fine. – Lior Magen Jan 02 '17 at 11:45
  • Won't work in this case:
    >>> ast.literal_eval(x)
    Traceback (most recent call last):
      File "", line 1, in
    NameError: name 'x' is not defined
    >>> x='{"A":true}'
    >>> ast.literal_eval(x)
    Traceback (most recent call last):
      File "", line 1, in
      File "/usr/lib/python2.7/ast.py", line 62, in
        return dict((_convert(k), _convert(v)) for k, v
      File "/usr/lib/python2.7/ast.py", line 79, in _convert
        raise ValueError('malformed string')
    ValueError: malformed string
    >>> y='{"A":True}'
    >>> ast.literal_eval(y)
    {'A': True}
    – Mohammad Shahid Siddiqui Nov 19 '17 at 18:31
15

This format is not JSON, but YAML, which you can parse with PyYAML:

>>> import yaml
>>> s = '{a: 1, b: 2}'
>>> d = yaml.safe_load(s)
>>> d
{'a': 1, 'b': 2}
>>> type(d)
<class 'dict'>
phihag
  • This successfully converts it. Using json libraries or dict() on the string will fail. Thanks. – NuclearPeon Aug 27 '13 at 17:44
  • 1
    The yaml trick does not work well with dicts like:
    In [12]: ast.literal_eval(s)
    Out[12]: {u'username': u'test'}
    In [13]: yaml.load(s)
    Out[13]: {"u'username'": "u'test'"}
    – Carl D'Halluin Sep 09 '15 at 13:38
8

You can use eval if you trust the input string.

>>> a=eval('{"a":1,"b":2}')
>>> a
{'a': 1, 'b': 2}
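To make the "if you trust the input string" caveat concrete, here is a small sketch. The names record, calls and untrusted are made up for illustration, and the harmless record() call stands in for whatever code a tampered file might contain. eval runs the call, while ast.literal_eval refuses it:

```python
import ast

calls = []  # records side effects triggered by evaluating a string

def record():
    """Harmless stand-in for code an attacker might want to run."""
    calls.append("executed")
    return {"a": 1}

untrusted = "record()"  # pretend this came from an edited save file

# eval executes the call expression; the namespace is passed explicitly.
result = eval(untrusted, {"record": record})
print(result, calls)

# literal_eval only accepts literals, so the same string is rejected.
try:
    ast.literal_eval(untrusted)
except ValueError:
    print("literal_eval refused to evaluate the call")
```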
Matthew Adams
  • 2
    However, for the sake of completeness, it should be noted that using `eval` is generally a bad idea. – Nathan Dec 03 '12 at 01:33
  • And `ast.literal_eval` is sufficient here. Note that however, the format mentioned in the question (`{a: 1, b: 2}`) is not actually Python code. – phihag Dec 03 '12 at 01:34
  • Yeah I guess my caveat wasn't emphasized enough. – Matthew Adams Dec 03 '12 at 01:34
  • 1
    @kuyan what's wrong with `eval`? People act like string-to-dict is like feeding data from online directly into a root command line. – tkbx Dec 03 '12 at 01:36
  • 1
    @tkbx: `eval` would be okay if you're completely confident that the input is safe. But if you aren't, it could be as bad as feeding data from online directly into the command line - for example, if somebody swapped the expected input with an `os.system` call. – Nathan Dec 03 '12 at 01:40
  • 1
    @tkbx Haha, yeah but someone could swap out the file you're reading from with a file containing malicious python code and the `eval` in your program would happily run it. – Matthew Adams Dec 03 '12 at 01:40
  • @MatthewAdams weird... But it's for a text game, so if they want to edit the file and screw up their system, that's their problem. – tkbx Dec 03 '12 at 02:01
  • @tkbx But if your program enables someone else to screw up the system by editing the file, that's your problem too. – Matthew Adams Dec 03 '12 at 02:05
  • @MatthewAdams but what I mean is, the file is called "lv.data" and is stored in ~/.some_directory. It would be difficult to find by accident, and anyone who can find it would realize that it's not meant to be user-edited (or at least that they shouldn't inject code into it). – tkbx Dec 03 '12 at 12:36
  • 2
    @tkbx Is it really important in this specific situation? Probably not. But the principle is definitely important. What if the people that wrote Microsoft Word took the same approach? If a hacker discovered that all they had to do to gain control of a system was modify one file, then that makes their "job" a whole lot easier. It's good to code with security in mind. – Matthew Adams Dec 03 '12 at 14:54
4

Serialization

What you are describing is object serialization, and there are better ways of doing it than rolling your own serialization method (though you seem to have come up with YAML). Two standard options, JSON and pickle, are shown below. Note that pickle in particular is not safe for untrusted data, because unpickling can execute arbitrary code; json.load only parses data and does not carry that risk.

JSON

Here is an example of doing what you want using JSON, a popular cross-language format:

import json

myDict = {'a':1, 'b':2}

# write to the file 'data'
with open('data','w') as f:
    json.dump(myDict, f)

# now we can load it back
with open('data','r') as f:
    myDictLoaded = json.load(f)

print(myDictLoaded)

Output:

{'a': 1, 'b': 2}
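If you want the string form rather than a file, json.dumps and json.loads do the same round trip in memory. A rough sketch, with state and load_state as hypothetical names: note that JSON object keys are always strings, so an integer key comes back as a string, and that the unquoted {a: 1, b: 2} form from the question is not valid JSON.

```python
import json

# Hypothetical game state; note the integer key 1.
state = {"level": 3, "items": ["sword", "key"], 1: "first"}

text = json.dumps(state)     # dict -> str
restored = json.loads(text)  # str -> dict

print(restored["level"])
# JSON keys are strings, so the int key 1 is now the string "1".
print(1 in restored, "1" in restored)

def load_state(text):
    """Return the parsed data, or None if `text` is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(load_state("{a: 1, b: 2}"))  # unquoted keys are not JSON
```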

pickle

Here is a second example doing the same thing with pickle. pickle is more powerful in that it can serialize almost all Python objects, including instances of classes you write.

import pickle

myDict = {'a':1, 'b':2}

# write to the file 'data' (pickle needs binary mode)
with open('data','wb') as f:
    pickle.dump(myDict, f)

# now we can load it back
with open('data','rb') as f:
    myDictLoaded = pickle.load(f)

print(myDictLoaded)

Output:

{'a': 1, 'b': 2}
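One practical difference from JSON, sketched below with made-up data: pickle preserves Python-specific types such as tuples, sets and non-string keys, which JSON would either convert or reject. As with any pickle use, this assumes the data comes from a source you trust.

```python
import pickle

# Hypothetical data using types JSON cannot represent exactly:
# an int key, a tuple value, and a set.
data = {1: (10, 20), "tags": {"red", "blue"}}

blob = pickle.dumps(data)      # dict -> bytes
restored = pickle.loads(blob)  # bytes -> dict (only for trusted data)

print(restored == data)
print(type(restored[1]).__name__, type(restored["tags"]).__name__)
```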
Matthew Adams