6

I have a string

"{a:'b', c:'d',e:''}"

Please note that the keys of the dictionary entries are unquoted, so a simple eval("{a:'b', c:'d',e:''}"), as suggested in a previous question, does not work.

What would be the most convenient way to convert this string to a dictionary?

{'a':'b', 'c':'d', 'e':''}
Lincoln
Harrison
  • 1
    What is the source of such a string? – juanpa.arrivillaga Jun 21 '16 at 05:14
  • 2
    @juanpa.arrivillaga: I'm guessing it's a Javascript object literal (_not_ JSON, which requires the property names to be quoted). – ShadowRanger Jun 22 '16 at 00:31
  • None of the answers here currently work in the general case. Consider a tokenizer-based approach, as described in user2357112's answer [here](https://stackoverflow.com/a/52900880/674039). – wim Oct 16 '19 at 17:23

5 Answers

7

If the string comes from a trusted source, you can use eval with a trick. Do not use this for general user input, as eval is not secure; if you are getting input from a potentially malicious user, you should use the JSON format and the json module instead.

source = """{e: '', a: 'b', c: 'd'}"""

class identdict(dict):
    def __missing__(self, key):
        return key

d = eval(source, identdict())
print(d)

prints

{'a': 'b', 'c': 'd', 'e': ''}

How this works is that we create a new dictionary subclass identdict that defines the magic method __missing__. This method is called for lookups on keys that are missing from the dictionary. In this case, we just return the key, so the dictionary maps keys to themselves. Then the source is evaluated using an identdict instance as the globals argument. eval will look up the values of variables from the globals mapping; as it is an identdict, the value of each variable accessed is conveniently now the name of the variable.

This also works for more complex strings as values, and for anything else that is proper Python literal syntax.
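To make the mechanism concrete, here is a minimal sketch, assuming CPython 3, that records which names eval asks the mapping for. The class name RecordingIdentDict and its seen attribute are hypothetical and exist only for illustration. The last part shows a limitation: a key that is a Python keyword is rejected by the parser before any name lookup takes place.

```python
class RecordingIdentDict(dict):
    """Hypothetical helper: maps any missing name to itself and logs it."""

    def __init__(self):
        super().__init__()
        self.seen = []  # names that were not found and were routed to __missing__

    def __missing__(self, key):
        self.seen.append(key)
        return key


namespace = RecordingIdentDict()
result = eval("{e: '', a: 'b', c: 'd'}", namespace)
print(result)          # the bare names became string keys
print(namespace.seen)  # the names that went through __missing__

# A keyword such as `if` is not a valid expression, so eval raises
# SyntaxError before the mapping is consulted at all.
try:
    eval("{a: 'b', if: 'd'}", RecordingIdentDict())
    keyword_ok = True
except SyntaxError:
    keyword_ok = False
print(keyword_ok)
```

Indexing an instance directly, as in namespace["x"], goes through the same __missing__ hook; a NameError only appears when the name is evaluated outside of eval, in a scope where it is undefined.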

Community
  • Nice. Although my content is alphanumeric only, good to learn something new. As I already accepted the other answer, I only can upvote for you. Appreciate. – Harrison Jun 21 '16 at 06:10
  • @Harrison it is always possible for you to change the accepted answer – Antti Haapala -- Слава Україні Jun 21 '16 at 06:15
  • @AnttiHaapala Great little hack! But how exactly does this work? I'm not quite understanding how this gets past the `NameError`; shouldn't that be raised before the dictionary even deals with a missing value? And how does it return a string? I added `print(type(key))` to the `__missing__` method, and it prints `` if I do something like `id[1]`, where `id` is an `identdict`, and if I do something like `id[x]` where x is not defined, I get a `NameError` and the `__missing__` method is never called! Clearly, I'm not understanding exactly what the `eval` built-in is doing. – juanpa.arrivillaga Jun 22 '16 at 01:32
  • @juanpa.arrivillaga no, the global variables in Python are stored in a per-module dictionary; additionally, a mapping of local variables can be provided to `eval`. Normally `eval` defaults to the mappings returned by calls to `globals()` and `locals()`; here we use an `identdict` instance instead. – Antti Haapala -- Слава Україні Jun 22 '16 at 04:17
  • This is a really underrated answer. Would be a really great example to quickly map the call hierarchy here. – Matthew Oct 16 '19 at 15:17
  • 4
    Hacky. Fails for `"{e: '', a: 'b', if: 'd'}"`. – wim Oct 16 '19 at 17:16
2

Depending on the complexity of what you're parsing, this could work:

s = "{a:'b', c:'d',e:''}"
d = dict([
    (x.split(':')[0].strip(), x.split(':')[1].strip("' "))
    for x in s.strip("{}").split(',')
])
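
A quick way to see the limits of this split-based approach is to feed it values that contain the delimiters. The sketch below wraps the same logic in a hypothetical function split_parse (the name is not part of the answer) and tries a value containing a colon and one containing a comma.

```python
def split_parse(s):
    """Same split-based logic as above, wrapped in a function for testing."""
    return dict([
        (x.split(':')[0].strip(), x.split(':')[1].strip("' "))
        for x in s.strip("{}").split(',')
    ])

# Works for simple alphanumeric keys and values.
print(split_parse("{a:'b', c:'d',e:''}"))

# A colon inside a value: everything after the colon is silently dropped.
print(split_parse("{a:'b:c'}"))

# A comma inside a value splits the pair in two; the second piece has
# no colon, so indexing [1] raises IndexError.
try:
    split_parse("{a:'b,c'}")
    comma_ok = True
except IndexError:
    comma_ok = False
print(comma_ok)
```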
Will
1

Manual parsing is error-prone and hard to make general, and eval-based approaches fail when the keys are Python keywords. The currently accepted answer breaks if values contain spaces, commas, or colons, and the eval answer can't handle keys like if or for.

Instead, we can tokenize the input as a series of Python tokens and replace NAME tokens with STRING tokens, then untokenize to build a valid dict literal. From there, we can just call ast.literal_eval.

import ast
import io
import tokenize

def parse(x):
    tokens = tokenize.generate_tokens(io.StringIO(x).readline)
    modified_tokens = (
        (tokenize.STRING, repr(token.string)) if token.type == tokenize.NAME else token[:2]
        for token in tokens)

    fixed_input = tokenize.untokenize(modified_tokens)

    return ast.literal_eval(fixed_input)

Then parse("{a:'b', c:'d',e:''}") == {'a':'b', 'c':'d', 'e':''}, and no problems occur with keywords as keys or special characters in the values:

>>> parse('{a: 2, if: 3}')
{'a': 2, 'if': 3}
>>> parse("{c: ' : , '}")
{'c': ' : , '}
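
One consequence of replacing every NAME token is that True, False, and None in value position are also turned into strings, because the tokenizer reports them as NAME tokens. If that matters, a possible variant, sketched here under the hypothetical name parse_keep_constants, leaves those three names untouched. The trade-off is that a key literally spelled True, False, or None then becomes the constant rather than a string.

```python
import ast
import io
import tokenize

# Names that should stay Python constants rather than become strings.
CONSTANTS = {"True", "False", "None"}

def parse_keep_constants(text):
    """Quote bare names, except True/False/None, then literal_eval the result."""
    tokens = tokenize.generate_tokens(io.StringIO(text).readline)
    modified = (
        (tokenize.STRING, repr(tok.string))
        if tok.type == tokenize.NAME and tok.string not in CONSTANTS
        else tok[:2]
        for tok in tokens
    )
    return ast.literal_eval(tokenize.untokenize(modified))

print(parse_keep_constants("{a: True, b: None, if: 3}"))
```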
user2357112
0

WARNING This approach will not work as desired if you have a key mapping to an empty string in the middle of your "dictionary." I'll not delete this answer because I think this approach might still be salvageable.

This might be a little more general than Will's answer, although it still depends on the exact structure of what you are parsing. If your key-value pairs consist of alphanumeric words, you should be fine, though.

In [3]: import re

In [4]: import itertools

In [5]: my_string = "{a:'b', c:'d',e:''}"

In [6]: temp = re.findall(r"\w+", my_string)

In [7]: temp = itertools.zip_longest(temp[0::2], temp[1::2], fillvalue = "")

In [8]: dict(temp)
Out[8]: {'a': 'b', 'c': 'd', 'e': ''}

If you want to know what is happening with the zip function, see these questions:

Collect every pair of elements from a list into tuples in Python

I used itertools.zip_longest so you can use a fill value, inspired by:

Pairs from single list
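
To illustrate the warning at the top of this answer, the following sketch wraps the same steps in a hypothetical function regex_pairs (the name is an assumption, and the pattern used is `\w+` so that multi-character words are matched). It shows what happens when an empty value appears before the last pair: the empty string produces no match, so every later key and value shifts by one position.

```python
import itertools
import re

def regex_pairs(text):
    """Pair up consecutive alphanumeric words as key, value."""
    words = re.findall(r"\w+", text)
    return dict(itertools.zip_longest(words[0::2], words[1::2], fillvalue=""))

# Empty value at the end: the fill value covers it.
print(regex_pairs("{a:'b', c:'d',e:''}"))

# Empty value in the middle: 'c' is mistaken for the value of 'a'.
print(regex_pairs("{a:'', c:'d'}"))
```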

Community
juanpa.arrivillaga
-1
import re
s = "{a:'b', c:'d',e:''}"  # named s to avoid shadowing the built-in str
# Each match is a (key, value) tuple: a bare word, a colon, then a single-quoted value.
d = dict(re.findall(r"(\w+):'(.*?)'", s))
John Doe