split comma-separated key-value pairs with commas

Question

This is a bit like this question: How to split comma-separated key-value pairs with quoted commas

But my input looks like this:

line='name=zhg,code=#123,"text=hello,boy"'

Note that the whole pair is quoted, "text=hello,boy", NOT text="hello,boy".

I'd like to split the line into a dict. The output I want is:

"name":"zhg","code":"#123","text":"hello,boy"

How can I get this using regex or shlex?

@AvinashRaj: I don't think he does. If he did, his link would answer his question. — zondo, Mar 12 '16 at 13:31

score 0 · Answer 1 · answered Mar 12 '16 at 14:13

You can't easily do that with a regex, or at least it won't be the most efficient way. Parsing such a string is straightforward with a single-pass parser:

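line = 'name=zhg,code=#123,"text=hello,boy"'


def read_quote(string):
    """Read a quoted field; `string` starts just after the opening quote."""
    out = ''
    for index, char in enumerate(string):
        if char == '"':
            # consumed: the field text, the closing quote and the comma after it (if any)
            return index + 2, out
        out += char
    # no closing quote: falls through and returns None (no validation)


def read(string):
    """Read an unquoted field up to the next comma or the end of the string."""
    out = ''
    for index, char in enumerate(string):
        if char == ',':
            return index + 1, out  # skip the comma
        out += char
    # end of string: everything was consumed
    return len(string), out


def components(string):
    index = 0
    while index < len(string):
        if string[index] == '"':
            inc, out = read_quote(string[index + 1:])
            index += inc + 1  # + 1 accounts for the opening quote
        else:
            inc, out = read(string[index:])
            index += inc
        yield out


# split on the first '=' only, so values may themselves contain '='
print(dict(e.split('=', 1) for e in components(line)))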

It prints the following (key order may differ between Python versions):

{'text': 'hello,boy', 'code': '#123', 'name': 'zhg'}

You can implement read and read_quote using a regex if you really want to.
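For illustration, here is a minimal sketch of what a regex-based variant might look like. It is not the code from this answer: the names FIELD and components_re are hypothetical, and the pattern assumes that a quoted field always spans the whole field and never contains an escaped quote.

```python
import re

# Hypothetical pattern: either a double-quoted field (escaped quotes are
# not supported) or a run of characters that contains no comma.
FIELD = re.compile(r'"([^"]*)"|([^,]+)')


def components_re(string):
    """Yield each comma-separated field with surrounding quotes removed."""
    for match in FIELD.finditer(string):
        quoted, bare = match.groups()
        # Exactly one of the two groups participates in each match.
        yield quoted if quoted is not None else bare


line = 'name=zhg,code=#123,"text=hello,boy"'
# Split on the first '=' only, so values may themselves contain '='.
result = dict(field.split('=', 1) for field in components_re(line))
print(result)
```

Like the hand-written parser, this sketch does no validation: empty fields are silently skipped, and a quote that starts in the middle of a field (as in text="hello,boy") is not treated as quoting.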

The code doesn't do validation and doesn't do error checking. — amirouche, Mar 12 '16 at 14:15

score 0 · Accepted Answer · answered Mar 12 '16 at 14:40


You can use csv.reader with a suitably "file-like" string.

>>> import csv
>>> import io
>>> line = 'name=zhg,code=#123,"text=hello,boy"'
>>> string_file = io.StringIO(line)
>>> for row in csv.reader(string_file):
...     print(row)
...
['name=zhg', 'code=#123', 'text=hello,boy']

answered Mar 12 '16 at 14:40 by chepner

Thank you! Then I can use str.split('=') to get what I want. – zhenghuagui Mar 13 '16 at 14:13
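Putting the csv.reader answer and the split step from this comment together, a complete version might look like the sketch below. It assumes Python 3 (io.StringIO) and that every field contains at least one '='; the variable names are only illustrative.

```python
import csv
import io

line = 'name=zhg,code=#123,"text=hello,boy"'

# csv.reader accepts any iterable of strings; io.StringIO mirrors the
# file-like approach used in the answer. The input is a single line,
# so only the first row is needed.
row = next(csv.reader(io.StringIO(line)))

# Split each field on the first '=' only, so values may contain '='.
result = dict(field.split('=', 1) for field in row)
print(result)  # {'name': 'zhg', 'code': '#123', 'text': 'hello,boy'}
```

A field without any '=' would make dict() raise a ValueError, so real input may need a check before the split.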
