90

I have a string that looks like this:

"Name1=Value1;Name2=Value2;Name3=Value3"

Is there a built-in class/function in Python that will take that string and construct a dictionary, as though I had done this:

dict = {
    "Name1": "Value1",
    "Name2": "Value2",
    "Name3": "Value3"
}

I have looked through the modules available but can't seem to find anything that matches.


Thanks, I do know how to write the relevant code myself, but since such smallish solutions are usually minefields waiting to happen (e.g. someone writes: Name1='Value1=2';), I usually prefer a pre-tested function.

I'll do it myself then.

Lasse V. Karlsen
  • does your question require to support `s = r'Name1='Value=2';Name2=Value2;Name3=Value3;Name4="Va\"lue;\n3"'` input (note: a semicolon inside a quoted string, a quote is escaped using a backslash, `\n` escape is used, both single and double quotes are used)? – jfs Dec 21 '14 at 20:42
  • This question of mine is over 6 years old, the code which involved this has long since been replaced :) And no, it didn't require support for quotes. I just wanted to have a prebuilt function instead of writing something myself. However, the code is long gone. – Lasse V. Karlsen Dec 21 '14 at 20:43

6 Answers

153

There's no builtin, but you can accomplish this fairly simply with a generator expression:

s = "Name1=Value1;Name2=Value2;Name3=Value3"
dict(item.split("=") for item in s.split(";"))
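
A slightly more defensive variant of the one-liner above might look like the sketch below. It still assumes there is no quoting at all; it only splits each pair on the first `=` (so a value may itself contain `=`) and skips empty segments such as the one left by a trailing `;`. The name `parse_pairs` and its parameters are hypothetical, not part of any library.

```python
def parse_pairs(s, pair_sep=";", kv_sep="="):
    """Parse 'k=v;k=v' into a dict. No quote handling (hypothetical helper)."""
    result = {}
    for item in s.split(pair_sep):
        if not item:
            continue  # skip empty segments, e.g. after a trailing ';'
        # partition splits on the first kv_sep only
        key, sep, value = item.partition(kv_sep)
        if not sep:
            raise ValueError("missing %r in %r" % (kv_sep, item))
        result[key] = value
    return result


print(parse_pairs("Name1=Value1;Name2=Value2;Name3=Value3"))
```

Raising on a segment without `=` is a design choice for this sketch; silently skipping it would be just as easy.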

[Edit] Your update indicates you may need to handle quoting. This complicates things, depending on the exact format you need (which quote characters are accepted, which escape characters, and so on). You may want to look at the csv module to see if it can cover your format. Here's an example. (The API is a little clunky for this purpose, because csv is designed to iterate over a sequence of records, hence the next() calls to look at just the first row. Adjust to suit your needs.)

>>> import csv
>>> s = "Name1='Value1=2';Name2=Value2;Name3=Value3"
>>> dict(next(csv.reader([item], delimiter='=', quotechar="'"))
...      for item in next(csv.reader([s], delimiter=';', quotechar="'")))
{'Name1': 'Value1=2', 'Name2': 'Value2', 'Name3': 'Value3'}

Depending on the exact structure of your format, you may need to write your own simple parser however.
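
As a rough idea of what such a hand-written parser could look like, here is a minimal sketch. It assumes a format the question never pins down: values may be wrapped in single or double quotes, a backslash inside quotes escapes the next character, and `;` or `=` inside quotes are ordinary characters. The function name `parse_quoted_pairs` is hypothetical, and escape sequences such as `\n` are not translated.

```python
def parse_quoted_pairs(s, pair_sep=";", kv_sep="=", quotes="'\""):
    """Parse 'k=v;k2="v;2"' into a dict, honouring quotes (hypothetical helper)."""
    result = {}
    key, buf = None, []   # key stays None until the first unquoted kv_sep
    quote = None          # the currently open quote character, if any
    chars = iter(s)
    for ch in chars:
        if quote:
            if ch == "\\":
                # keep the escaped character literally (no \n translation)
                buf.append(next(chars, ""))
            elif ch == quote:
                quote = None              # closing quote
            else:
                buf.append(ch)
        elif ch in quotes:
            quote = ch                    # opening quote
        elif ch == kv_sep and key is None:
            key, buf = "".join(buf), []   # end of the name, start of the value
        elif ch == pair_sep:
            if key is not None:           # segments without '=' are dropped
                result[key] = "".join(buf)
            key, buf = None, []
        else:
            buf.append(ch)
    if quote:
        raise ValueError("unterminated quote in %r" % s)
    if key is not None:
        result[key] = "".join(buf)
    return result


print(parse_quoted_pairs("Name1='Value1=2;';Name2=Value2;Name3='Value3'"))
```

Whether quotes should be stripped, how escapes behave, and what to do with malformed segments all depend on the real format, so treat each of those choices as a placeholder.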

Brian
6

This comes close to doing what you wanted:

>>> import urlparse
>>> urlparse.parse_qs("Name1=Value1;Name2=Value2;Name3=Value3")
{'Name2': ['Value2'], 'Name3': ['Value3'], 'Name1': ['Value1']}
Kyle Gibson
  • 3
    it breaks if there is `&` or `%` in the input. – jfs Dec 21 '14 at 20:59
  • @jfs but the string does not contain either of those. – Vishal Singh Aug 23 '20 at 08:08
  • 3
    @VishalSingh: most visitors on StackOverflow are from google and therefore answers here are not only for the original poster who asked the question. If I came here looking for how to parse a "semicolon-separated string to a dictionary, in Python" then my strings might contain `&` or `%` -- at the very least, it is worth mentioning that the answer doesn't work for such strings. – jfs Aug 24 '20 at 16:28
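
For readers on Python 3, a sketch of the same idea follows. It assumes a release recent enough that `urllib.parse.parse_qs` accepts a `separator` argument (3.10, and as far as I know the later security updates of some earlier versions); on those releases the default separator is `&` only, so `;` has to be requested explicitly. The dict comprehension flattens the one-element lists. Percent-escapes and `+` are still decoded, so the caveat about `%` in the comments above applies here too.

```python
from urllib.parse import parse_qs

s = "Name1=Value1;Name2=Value2;Name3=Value3"

# separator=";" is an assumption about the Python version in use:
# newer releases split only on "&" unless told otherwise.
parsed = parse_qs(s, separator=";")

# parse_qs maps each name to a list of values; keep the first of each.
flat = {name: values[0] for name, values in parsed.items()}
print(flat)
```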
4
s1 = "Name1=Value1;Name2=Value2;Name3=Value3"

dict(map(lambda x: x.split('='), s1.split(';')))
Petter Friberg
D. Om
1

It can be done simply with a string join and a list comprehension (note that this goes in the opposite direction, from a dictionary to a string):

",".join(["%s=%s" % x for x in d.items()])

>>> d = {'a': 1, 'b': 2}
>>> ','.join(['%s=%s' % x for x in d.items()])
'a=1,b=2'
Vishal Singh
vijay
-2
easytiger $ cat test.out test.py | sed 's/^/    /'
p_easytiger_quoting:1.84563302994
{'Name2': 'Value2', 'Name3': 'Value3', 'Name1': 'Value1'}
p_brian:2.30507516861
{'Name2': 'Value2', 'Name3': "'Value3'", 'Name1': 'Value1'}
p_kyle:7.22536420822
{'Name2': ['Value2'], 'Name3': ["'Value3'"], 'Name1': ['Value1']}
import timeit
import urlparse

s = "Name1=Value1;Name2=Value2;Name3='Value3'"

def p_easytiger_quoting(s):
    d = {}
    s = s.replace("'", "")
    for x in s.split(';'):
        k, v = x.split('=')
        d[k] = v
    return d


def p_brian(s):
    return dict(item.split("=") for item in s.split(";"))

def p_kyle(s):
    return urlparse.parse_qs(s)



print "p_easytiger_quoting:" + str(timeit.timeit(lambda: p_easytiger_quoting(s)))
print p_easytiger_quoting(s)


print "p_brian:" + str(timeit.timeit(lambda: p_brian(s)))
print p_brian(s)

print "p_kyle:" + str(timeit.timeit(lambda: p_kyle(s)))
print p_kyle(s)
easytiger
  • This doesn't answer the question, because it doesn't handle quoting. Try `s = "Name1='Value1=2';Name2=Value2"` and `csv` (as in Brian's accepted answer) or `parse_qs` (as in Kyle's) will get it right, while yours will raise a `ValueError`. The OP specifically says "such smallish solutions are usually mine-fields waiting to happen", which is why he wants a built-in or other well tested solution, and he gives an example that will break your code. – abarnert Mar 27 '13 at 00:05
  • Ahh i didn't see that. still. it would still be faster than all your solutions to preparse those in the main string before the iteration takes place and recalling the replace function thousands of times. I will update – easytiger Mar 27 '13 at 00:13
  • I'm not sure how you're going to preparse it. But even if you do, this seems like exactly what the OP was afraid of in a simple solution. Are you sure there are no other mines ahead? Can you prove it to the OP's satisfaction? – abarnert Mar 27 '13 at 00:18
  • OK, now that I've seen your edit… First, `s.replace` doesn't do anything at all; it just returns a new string that you ignore. Second, even if you got it right (`s = s.replace…`), that doesn't fix the problem, it just adds a new one on top of it. Try it on either my example or the OP's. – abarnert Mar 27 '13 at 00:21
  • The specification clearly includes handling the sample input he mentioned in his question, `Name='Value1=2';`. And your code doesn't handle it. And I'm not sure how you'd sanitize that without parsing it in some way that will be just as slow as `urlparse` or `csv` in the first place. – abarnert Mar 27 '13 at 00:24
  • Your new attempt still doesn't fix the problem. Do this: `s = "Name1='Value1=2;';Name2=Value2;Name3='Value3'"`. Having an `=` or `;` inside the quotes is critical, because that's the whole point of quoting. – abarnert Mar 27 '13 at 00:29
  • Sorry, I was trying to fix it from my phone. I've added a new update. Also, please look at the output of both of your functions; they are all wrong. Brian, yours INCLUDES the quotes; in his specification he removes them from the map, so the value is a string without quotes. And Kyle's puts each element in a map with a list as the value. abarnert, I'm afraid that is incorrect; it will be faster to swap them out at the start. – easytiger Mar 27 '13 at 00:30
  • ahh i see what you mean... he didnt provide that in his example output. `"Name1='Value1=2;'` in that case if i had to handle that a regex split would work well. However this is prob largely an issue of user input sanitisation. (i know bad answer etc). – easytiger Mar 27 '13 at 00:34
  • I don't think this is a sanitisation issue; I think he really does need quoting. Since the question you're answering is 5 years old, it may not be easy to find out… but it's certainly reasonable. Formats very much like this are used in URL-encoded forms, config files, CSV, etc., and they all either have some kind of quoting or some kind of escaping instead. – abarnert Mar 27 '13 at 00:37
  • Also, you can trivially fix the other two answers. For Brian's… actually, it _doesn't_ include the quotes; it's already correct. But if it did, you'd just `….strip("'")` on the output. For Kyle's, do `{k:v[0] for k, v in …}`. But you can't trivially fix your original answer (or Brian's original one), because without a parser, handling quoting is hard. Anyway, the fact that it took this much effort just to _see_ the problem, much less solve it, should demonstrate what the OP meant by "minefield". – abarnert Mar 27 '13 at 00:39
-2

If your Value1, Value2 are just placeholders for actual values, you can also use the dict() function in combination with eval().

>>> s = "Name1=1;Name2=2;Name3='string'"
>>> print eval('dict('+s.replace(';',',')+')')
{'Name2': 2, 'Name3': 'string', 'Name1': 1}

This is because the dict() function understands the syntax dict(Name1=1, Name2=2, Name3='string'). Spaces in the string (e.g. after each semicolon) are ignored. But note that string values do require quoting.

Rabarberski
  • Thanks, upvote string.replace worked well. Don't know why I couldn't split. I did i = textcontrol.GetValue() on tc box, then o = i.split(';') but didn't output a string just complained about format, unlike replace. – Iancovici Jun 13 '13 at 18:18
  • 1
    `s.replace(';'`-based solution breaks if there is `;` inside a quoted value. [eval is evil](http://stackoverflow.com/a/9558001/4279) and it is unnecessary in this case. – jfs Dec 21 '14 at 21:05
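
If the goal is only to turn values such as `1` or `'string'` into Python objects, one way to avoid `eval()` is `ast.literal_eval`, which accepts literals and rejects other expressions. The sketch below is an illustration under that assumption; `parse_literals` is a hypothetical name, and because it still splits on `;` first, it breaks on a `;` inside a quoted value just like the `replace`-based version.

```python
import ast


def parse_literals(s):
    """Parse "k=1;k2='text'" into a dict of Python literals (hypothetical helper)."""
    result = {}
    for item in s.split(";"):
        if not item.strip():
            continue  # ignore empty segments
        key, _, value = item.partition("=")
        # literal_eval only accepts literals (numbers, strings, tuples, ...);
        # anything else raises an error instead of being executed.
        result[key.strip()] = ast.literal_eval(value.strip())
    return result


print(parse_literals("Name1=1; Name2=2; Name3='string'"))
```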