
I have tried the approaches from quite a few links, but most of them give me errors such as:

ValueError: Parse error: unable to parse: 'hover_data=["Confirmed","Deaths","Recovered"], animation_frame="Date",color_continuous_scale="Portland",radius=7, zoom=0,height=700"'

For example I want to convert the following string into a dict:

abc= 'fn=True, lat="Lat", lon="Long", hover_name="Country/Province/State",hover_data=["Confirmed","Deaths","Recovered"], animation_frame="Date",color_continuous_scale="Portland",radius=7, zoom=0,height=700"'

Expected output:

{"fn": True,
 "lat": "Lat",
 "lon": "Long",
 "hover_name": "Country/Province/State",
 "hover_data": ["Confirmed", "Deaths", "Recovered"],
 "animation_frame": "Date",
 "color_continuous_scale": "Portland",
 "radius": 7,
 "zoom": 0,
 "height": 700}

I tried to use this reference's code:

import re

keyval_re = re.compile(r'''
   \s*                                  # Leading whitespace is ok.
   (?P<key>\w+)\s*=\s*(                 # Search for a key followed by..
       (?P<str>"[^"]*"|\'[^\']*\')|     #   a quoted string; or
       (?P<float>\d+\.\d+)|             #   a float; or
       (?P<int>\d+)                     #   an int.
   )\s*,?\s*                            # Handle comma & trailing whitespace.
   |(?P<garbage>.+)                     # Complain if we get anything else!
   ''', re.VERBOSE)

def handle_keyval(match):
    if match.group('garbage'):
        raise ValueError("Parse error: unable to parse: %r" %
                         match.group('garbage'))
    key = match.group('key')
    if match.group('str') is not None:
        return (key, match.group('str')[1:-1]) # strip quotes
    elif match.group('float') is not None:
        return (key, float(match.group('float')))
    elif match.group('int') is not None:
        return (key, int(match.group('int')))

    elif match.group('list') is not None:
        return (key, int(match.group('list')))
    
    elif match.group('bool') is not None:
        return (key, int(match.group('bool')))

print(dict(handle_keyval(m) for m in keyval_re.finditer(abc)))
ilovewt
  • Note that your example contains balanced brackets (`[` and `]`), and regular expressions are an ill-suited tool for dealing with them; see https://stackoverflow.com/questions/7898310/using-regex-to-balance-match-parenthesis – Daweo Jan 31 '21 at 09:29
  • What is your expected output? All keys and all values are strings in your expected output? – lllrnr101 Jan 31 '21 at 09:48
  • @lllrnr101 I have updated the expected output, thanks. – ilovewt Jan 31 '21 at 09:51
  • @ilovewt -- There seems to be an unwanted double-quote character as the last character of your string `abc`. If that's some kind of typo error and is removed, my answer will work. – fountainhead Jan 31 '21 at 11:54

2 Answers

1

There seems to be an unwanted double-quote character as the last character of your string abc.

If that is removed, the following solution will work nicely:

eval("dict(" + abc + ")")

Output:

{'fn': True,
 'lat': 'Lat',
 'lon': 'Long',
 'hover_name': 'Country/Province/State',
 'hover_data': ['Confirmed', 'Deaths', 'Recovered'],
 'animation_frame': 'Date',
 'color_continuous_scale': 'Portland',
 'radius': 7,
 'zoom': 0,
 'height': 700}
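
To make the effect of the stray quote concrete, here is a minimal sketch. The odd-quote-count check used to decide whether to drop the last character is my own heuristic, not part of the answer above, and `eval` is applied only to a string written directly in the program.

```python
# The question's string, including the stray trailing double quote.
abc = ('fn=True, lat="Lat", lon="Long", hover_name="Country/Province/State",'
       'hover_data=["Confirmed","Deaths","Recovered"], animation_frame="Date",'
       'color_continuous_scale="Portland",radius=7, zoom=0,height=700"')

try:
    eval("dict(" + abc + ")")
    failed = False
except SyntaxError:
    # The unmatched quote opens a string literal that is never closed.
    failed = True

# Heuristic (an assumption): an odd number of double quotes plus a trailing
# quote suggests that the last character is the stray one.
if abc.endswith('"') and abc.count('"') % 2 == 1:
    cleaned = abc[:-1]
else:
    cleaned = abc

result = eval("dict(" + cleaned + ")")
print(failed)
print(result)
```

A last value that is itself a quoted string would leave an even quote count, so the heuristic leaves such input untouched.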
fountainhead
  • Works like a charm! But one quick question, `eval` seems to automatically sort my keys in alphabetical order, is there a way to bypass it without sorting the dictionary again after calling `eval`? – ilovewt Jan 31 '21 at 11:58
  • @ilovewt Well, I can't reproduce that behavior. As you can see in my output, the keys in the dictionary are in the same order, as they were present in the original string `abc`. – fountainhead Jan 31 '21 at 12:02
  • Seems like I need to explicitly call `print(eval(...))` to get back the exact same order, calling `eval(...)` alone in an IDE will cause the keys to be sorted, somehow. – ilovewt Jan 31 '21 at 12:58
  • `d = eval("dict(" + abc + ")"); print (d.keys())` shows that the keys are in the input order. That's all the assurance we need, I think. – fountainhead Jan 31 '21 at 13:08
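
One possible explanation for the sorted display, offered as an assumption since the thread never identifies the IDE: some interactive consoles pretty-print results, and Python's `pprint` sorts dict keys by default. The dict itself keeps insertion order (guaranteed since Python 3.7). A small sketch with a shortened, made-up dict:

```python
import pprint

# A shortened stand-in for the question's dict; keys deliberately unsorted.
d = dict(zoom=0, lat="Lat", height=700)

insertion_order = list(d)                          # order the keys were added
sorted_view = pprint.pformat(d)                    # pprint sorts keys by default
ordered_view = pprint.pformat(d, sort_dicts=False) # Python 3.8+: no sorting

print(insertion_order)
print(sorted_view)
print(ordered_view)
```

If the display order matters, passing `sort_dicts=False` (or simply calling `print`) shows the keys as they were inserted.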
1

⚠️ DON'T USE EVAL.

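import ast
import re

test_string = 'fn=True, lat="Lat", lon="Long", hover_name="Country/Province/State",hover_data=["Confirmed","Deaths","Recovered"], animation_frame="Date",color_continuous_scale="Portland",radius=7, zoom=0,height=700'
# Split on ", " or on a comma directly followed by a word character. In this
# input the commas inside the list are followed by a quote, so they are kept.
items = re.split(r', |,(?=\w)', test_string)

d = {
    key: ast.literal_eval(val)
    for item in items
    # Strip the item and split once on "=" with optional surrounding whitespace.
    for key, val in [re.split(r'\s*=\s*', item.strip(), maxsplit=1)]
}

print(d)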

I used a very simple method: split the string on the commas that separate the pairs, then build the dict with a plain dict comprehension. ast.literal_eval() converts each value string into the corresponding Python literal (bool, str, list, int) without executing arbitrary code.
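
The comma-splitting above depends on how this particular input is spaced (a list written as `[1, 2]` would be split in the wrong place). As a hedged alternative, not part of the original answer, one could let Python's own parser separate the arguments by wrapping the text in a dummy call and reading its keywords. The helper name `parse_kwargs` and the dummy function name `f` are hypothetical.

```python
import ast


def parse_kwargs(text):
    """Parse 'a=1, b="x", c=[1, 2]' into a dict without executing any code."""
    # Wrap the text in a dummy call so the parser splits the arguments for us.
    call = ast.parse(f"f({text})", mode="eval").body
    if (not isinstance(call, ast.Call)
            or not isinstance(call.func, ast.Name)
            or call.args):
        raise ValueError("expected only key=value pairs")
    result = {}
    for kw in call.keywords:
        if kw.arg is None:  # a '**mapping' argument has no key name
            raise ValueError("'**' arguments are not supported")
        # literal_eval accepts an AST node and rejects anything but literals.
        result[kw.arg] = ast.literal_eval(kw.value)
    return result


test_string = 'fn = True, lat="Lat", hover_data=["Confirmed", "Deaths"], radius=7'
print(parse_kwargs(test_string))
```

Because each value goes through `ast.literal_eval`, something like `x=__import__("os")` raises `ValueError` instead of being run, and whitespace around `=` or inside lists needs no special handling.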

Tsubasa
  • Thanks for the solution. Could you provide a more complete snippet that also handles edge cases like "fn = True, ..."? Since you are splitting solely on `=`, there may be unwanted whitespace around it. – ilovewt Jan 31 '21 at 12:26
  • @ilovewt In that case, splitting with regular expressions comes in very handy. Just use `re.split()`. I've updated my code :) – Tsubasa Jan 31 '21 at 12:41
  • @Xua -- I've gone through the SO thread you've linked in your answer. I've gone through all the answers posted on that thread. I've gone through the comments under each answer and under the question. CLEARLY, the risk exists only if the string being passed to `eval` comes from an external source, such as user input. The upshot of that thread is CLEARLY "avoid using it blindly, and avoid running away from it blindly". It's a pity people misquote and propagate the myth. – fountainhead Jan 31 '21 at 12:43
  • Then how can you be so sure that the string in the OP's question is not coming from an external source? – Tsubasa Jan 31 '21 at 12:47
  • @Xua -- No, I'm not sure that it's not coming from insecure sources. Nor can you be sure that it is coming from insecure sources. I don't have a problem if someone says "avoid using it" or "use it under these specific circumstances only". I have a problem with people just bluntly saying "DON'T USE". As long as the string was put in there by a programmer, it is as secure as the rest of the program itself, no less, no more. – fountainhead Jan 31 '21 at 12:48
  • Yes, I know, but looking at the string I can say that this is probably not the kind of string any human is ever going to write. This is a help forum, so we need at least minimal data to post our solution. Hence the string was put there. – Tsubasa Jan 31 '21 at 12:57
  • @Xua -- "looking at the string I can say that this is probably not the kind of string any human is ever going to write" That's exactly my point. It definitely is NOT the kind of string that the user will be entering on the console. It's the kind of string that will more likely be coming from a programmer or a program output, explicitly put there by a programmer. – fountainhead Jan 31 '21 at 13:01
  • Or this could also be a fragment of scraped data, in which case it may not be secure. :) – Tsubasa Jan 31 '21 at 13:06