Manual parsing is error-prone and hard to make general, and eval-based approaches fail when the keys are Python keywords. The currently accepted answer breaks if values contain spaces, commas, or colons, and the eval answer can't handle keys like if or for.
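For instance, anything that hands the raw string to eval fails at the compilation step as soon as a keyword appears in key position, no matter what globals or locals are supplied (the exact error text varies by Python version, but it is always a SyntaxError):

>>> eval('{if: 3}')
SyntaxError: invalid syntax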
Instead, we can tokenize the input as a series of Python tokens and replace NAME tokens with STRING tokens, then untokenize to build a valid dict literal. From there, we can just call ast.literal_eval.
import ast
import io
import tokenize

def parse(x):
    # Tokenize the input string, turn every NAME token into an equivalent
    # STRING token (repr adds the quotes), and keep all other tokens as-is.
    tokens = tokenize.generate_tokens(io.StringIO(x).readline)
    modified_tokens = (
        (tokenize.STRING, repr(token.string)) if token.type == tokenize.NAME else token[:2]
        for token in tokens)
    # Untokenizing gives back a dict literal with properly quoted keys,
    # which ast.literal_eval can evaluate safely.
    fixed_input = tokenize.untokenize(modified_tokens)
    return ast.literal_eval(fixed_input)
Then parse("{a:'b', c:'d',e:''}") == {'a':'b', 'c':'d', 'e':''}, and no problems occur with keywords as keys or special characters in the values:
>>> parse('{a: 2, if: 3}')
{'a': 2, 'if': 3}
>>> parse("{c: ' : , '}")
{'c': ' : , '}
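To see what the token rewriting actually produces, you can print the intermediate string before it reaches ast.literal_eval. This is just an illustration of the same steps parse performs; the exact whitespace that untokenize emits can vary, but the result is always a dict literal with quoted keys:

import ast
import io
import tokenize

src = "{a: 2, if: 3}"
tokens = tokenize.generate_tokens(io.StringIO(src).readline)
fixed = tokenize.untokenize(
    (tokenize.STRING, repr(t.string)) if t.type == tokenize.NAME else t[:2]
    for t in tokens)
print(fixed)                    # something like: {'a' : 2 , 'if' : 3 }
print(ast.literal_eval(fixed))  # {'a': 2, 'if': 3}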