Python Split dynamic string to key value pairs

Question

I need some help.
I have dynamic strings like these:

S31
S4
S2M1L10XL8
S1M2L0XL0
S0M5L6XL8

and I need to convert each one into key-value pairs like this:

{"S":31}
{"S":4}
{"S":2, "M":1, "L":10, "XL":8}
{"S":1, "M":2, "L":0, "XL":0}
{"S":0, "M":5, "L":6, "XL":8}

I tried this:

new_string = re.findall('(\d+|[A-Za-z]+)', string)

but I can't figure out how to get from there to the dictionaries.

Your re.findall gets you most of the way there actually. Now just iterate pairs ([see here](https://stackoverflow.com/questions/5434891/iterate-a-list-as-pair-current-next-in-python)) and make the dict from that. — wim, Aug 27 '21 at 22:06
`([A-Z]+)(\d+)` is likely the regex you want to start with. That would return something like `[('S', '0'), ('M', '5'), ('L', '6'), ('XL', '8')]` which you can then convert into a dictionary (and parse strings into numbers). — DemiPixel, Aug 27 '21 at 22:06
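
Both comments can be sketched in a few lines. This is only an illustration, not code from either commenter: the names `tokens`, `pairs`, and `grouped` are made up here, and it assumes the string always alternates letters and digits, starting with letters.

```python
import re

s = "S0M5L6XL8"

# wim's suggestion: keep the asker's pattern, then pair up alternating tokens.
tokens = re.findall(r'(\d+|[A-Za-z]+)', s)  # ['S', '0', 'M', '5', 'L', '6', 'XL', '8']
pairs = dict(zip(tokens[::2], map(int, tokens[1::2])))

# DemiPixel's suggestion: capture each name together with its number.
grouped = {name: int(num) for name, num in re.findall(r'([A-Z]+)(\d+)', s)}

print(pairs)    # {'S': 0, 'M': 5, 'L': 6, 'XL': 8}
print(grouped)  # {'S': 0, 'M': 5, 'L': 6, 'XL': 8}
```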

L8Cod3r · Accepted Answer · 2021-08-28T02:22:50.083

Try this:

dict(re.findall(r'(\D+)(\d+)', your_string))

>>> import re
>>> s = "S2M1L10XL8"
>>> re.findall(r'(\D+)(\d+)', s)
[('S', '2'), ('M', '1'), ('L', '10'), ('XL', '8')]

1st Capturing Group (\D+): \D matches any character that's not a digit (equivalent to [^0-9]). The + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy).

2nd Capturing Group (\d+): \d matches a digit (equivalent to [0-9]). The + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy).

https://regex101.com/r/hww2rm/1
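
To see how this pattern behaves on all the strings from the question, here is a small sketch. The helper name `parse_sizes` is hypothetical, and the `int()` conversion is an addition: `dict(re.findall(...))` on its own leaves the values as strings. It also assumes clean input, since `\D` matches spaces and punctuation as well as letters.

```python
import re

samples = ["S31", "S4", "S2M1L10XL8", "S1M2L0XL0", "S0M5L6XL8"]

def parse_sizes(text):
    """Hypothetical helper: map each non-digit run to the integer that follows it."""
    return {name: int(num) for name, num in re.findall(r'(\D+)(\d+)', text)}

for sample in samples:
    print(sample, '->', parse_sizes(sample))

# Without the conversion, the values stay strings:
print(dict(re.findall(r'(\D+)(\d+)', "S4")))  # {'S': '4'}
```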

Doesn't convert the number strings. – Kelly Bundy Aug 27 '21 at 22:14

Barmar · Answer 2 · score 1 · 2021-08-27T22:35:00.550

The regexp should match letters followed by numbers, not letters or numbers. Put each of them in a separate capture group. You can then iterate over the matches and use a dictionary comprehension to build the dictionary, converting each number to an integer:

new_dict = {name.upper(): int(num) for name, num in re.findall(r'([A-Z]+)(\d+)', string, flags=re.I)}
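
As a quick check of what `flags=re.I` and `.upper()` contribute, the comprehension can be wrapped in a function and fed lowercase input. The wrapper name `to_size_dict` is made up for this sketch, and the lowercase string is an assumed test case that does not appear in the question.

```python
import re

def to_size_dict(string):
    # Hypothetical wrapper around the comprehension above.
    return {name.upper(): int(num)
            for name, num in re.findall(r'([A-Z]+)(\d+)', string, flags=re.I)}

print(to_size_dict("S2M1L10XL8"))  # {'S': 2, 'M': 1, 'L': 10, 'XL': 8}
# re.I lets [A-Z] match lowercase letters; upper() normalizes the keys.
print(to_size_dict("s2m1l10xl8"))  # {'S': 2, 'M': 1, 'L': 10, 'XL': 8}
```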

edited Aug 27 '21 at 22:35

answered Aug 27 '21 at 22:08


Ugh, I had an earlier version that did that when using finditer. The simplification lost that. – Barmar Aug 27 '21 at 22:16
@KellyBundy Updated to do the conversion – Barmar Aug 27 '21 at 22:19
I tried `finditer` as well, but sadly it only produces pesky Match objects. Btw see my answer why I disagree with your first sentence :-P – Kelly Bundy Aug 27 '21 at 22:24
Yeah, my original had `m.group(1).upper()` and `int(m.group(2))` in a dictionary comprehension. – Barmar Aug 27 '21 at 22:31
Those could be shortened thanks to [`__getitem__`](https://docs.python.org/3/library/re.html#re.Match.__getitem__). – Kelly Bundy Aug 27 '21 at 22:33
I've been silly. If I don't just use the `findall()` result directly, there's no reason to call `dict()`. I put back the dictionary comprehension. – Barmar Aug 27 '21 at 22:35
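
The `finditer` variant discussed in these comments might look roughly like the sketch below. It is a reconstruction from the comments, not Barmar's original code, and it uses the `Match.__getitem__` shorthand Kelly Bundy links to (`m[1]` instead of `m.group(1)`, available since Python 3.6).

```python
import re

string = "S1M2L0XL0"

# Each Match object supports indexing: m[1] is group 1, m[2] is group 2.
new_dict = {m[1].upper(): int(m[2])
            for m in re.finditer(r'([A-Z]+)(\d+)', string, flags=re.I)}

print(new_dict)  # {'S': 1, 'M': 2, 'L': 0, 'XL': 0}
```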

score -1 · Answer 3 · answered Aug 27 '21 at 22:22

Your regex usage is alright, here's one way to make it work:

it = iter(re.findall('(\d+|[A-Za-z]+)', string))
result = dict(zip(it, map(int, it)))
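
The trick here is that `zip` and `map` both draw from the same iterator, so the tokens are consumed two at a time. The sketch below spells that out on one of the question's strings; the intermediate name `tokens` is added only for illustration.

```python
import re

string = "S2M1L10XL8"

tokens = re.findall(r'(\d+|[A-Za-z]+)', string)
# tokens == ['S', '2', 'M', '1', 'L', '10', 'XL', '8']

it = iter(tokens)
# zip pulls one item from `it` for the key, then map(int, it) pulls the next
# item from the same iterator for the value, so tokens are consumed in pairs.
result = dict(zip(it, map(int, it)))

print(result)  # {'S': 2, 'M': 1, 'L': 10, 'XL': 8}
```

This relies on the string strictly alternating letters and digits. A trailing name without a number is dropped silently, and a string that starts with digits raises `ValueError` when `int()` is applied to a letter token.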

Kelly Bundy

