1

I have a list of strings, and each element of the list contains several key: value pairs separated by commas, with a colon between each key and its value. I am trying to convert each element into a dictionary. For example, one element in my list looks like this:

attributesList[0]
Out: 'Health Score: A, Happy Hour Specials: Yes, Vegan Options: Yes, Takes Reservations: Yes, Delivery: No, Take-out: Yes, Accepts Credit Cards: Yes, Good For: Brunch, Lunch, Dinner, Parking: Street, Bike Parking: Yes, Wheelchair Accessible: Yes, Good for Kids: No, Good for Groups: Yes, Ambience: Casual, Trendy, Classy, Noise Level: Average, Alcohol: Beer & Wine Only, Good For Happy Hour: Yes, Outdoor Seating: Yes, Wi-Fi: Free, Has TV: No, Waiter Service: Yes, Caters: No, Gender Neutral Restrooms: Yes'

Based on solutions in link 1 and link 2, I tried the following approaches:

attributesDict = dict(s.split(':') for s in attributesList)
attributesDict = dict(map(str.strip, s.split(':')) for s in attributesList)
attributesDict = dict(map(lambda s : s.split(':') for s in attributesList))

But I keep getting error messages shown below in each of the approaches:

ValueError: dictionary update sequence element #0 has length 24; 2 is required
ValueError: dictionary update sequence element #0 has length 24; 2 is required
TypeError: map() must have at least two arguments.
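
A minimal reproduction of the first two errors, assuming a shortened, made-up string in place of my real data (the name `short` is only for illustration). It suggests that splitting on the colon alone returns one flat list per element instead of key/value pairs:

```python
# Hypothetical one-element stand-in for attributesList (made-up, shortened data)
short = ['Delivery: No, Take-out: Yes']

# Splitting on ':' alone gives one flat list, not (key, value) pairs
pieces = short[0].split(':')
print(pieces)  # ['Delivery', ' No, Take-out', ' Yes']

# dict() needs every item of the sequence to have exactly 2 entries
try:
    dict(s.split(':') for s in short)
    message = None
except ValueError as err:
    message = str(err)
print(message)
```

My real element has 23 colons, which would explain the "length 24" in the messages above.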

I looked at a solution here, but it is not clear to me how to apply it in my context. I am also concerned about entries that have multiple comma-separated items after the colon, as in the case below:

 Good For: Brunch, Lunch, Dinner,

Can I capture the three items after the colon as a single value in the dictionary? How can I achieve this?

Edit: adding desired output below

attributesDict[0]
Out: {'Health Score': 'A', 'Happy Hour Specials': 'Yes', 'Vegan Options': 'Yes', 'Takes Reservations': 'Yes', 'Delivery': 'No', 'Take-out': 'Yes', 'Accepts Credit Cards': 'Yes', 'Good For': 'Brunch, Lunch, Dinner', 'Parking': 'Street', 'Bike Parking': 'Yes', 'Wheelchair Accessible': 'Yes', 'Good for Kids': 'No', 'Good for Groups': 'Yes', 'Ambience': 'Casual, Trendy, Classy', 'Noise Level': 'Average', 'Alcohol': 'Beer & Wine Only', 'Good For Happy Hour': 'Yes', 'Outdoor Seating': 'Yes', 'Wi-Fi': 'Free', 'Has TV': 'No', 'Waiter Service': 'Yes', 'Caters': 'No', 'Gender Neutral Restrooms': 'Yes'}
Rnovice
  • Usage is `map(function, iterable)`, so `map(lambda s: s.split(':'), attributesList)` – azro Mar 16 '20 at 19:45
  • @Rnovice Can you provide an example of the desired output? – Jay Mody Mar 16 '20 at 19:49
  • @azro, using your suggestion removes the TypeError in my post but results in the ValueError in the post. – Rnovice Mar 16 '20 at 19:54
  • What is really your list attributesList ? could share it ? with the expected output – azro Mar 16 '20 at 20:04
  • Included an expected output in my question as suggested. The whole string I posted is the first element of my list. The list has over 1000 elements like that. – Rnovice Mar 16 '20 at 20:08
  • The problem is: you write about splitting on `:`, and your first input seems to contain many `:` pairs, but then you say it is only one element, so we don't know whether there is ONE pair per list element OR whether each element of the list should be split too. Finally, your expected output is NOT a dictionary; a dict is `{key: value}`, and your **:** is inside the string – azro Mar 16 '20 at 20:10
  • Sorry for the confusion. I showed only the first element in my list to keep things simple and less cluttered. Yes, there are many : pairs in a single element and each element has to be split into key:value pairs as in a dictionary. Sorry, the : should not be inside the string. I changed the expected output accordingly. Hope this clarifies. – Rnovice Mar 16 '20 at 21:39

2 Answers

4

Provided you really want a list of dicts as output, you could do something like:

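def get_dict(l):
    result = {}
    for v in l.split(','):
        if ':' in v:
            # New "key: value" pair; split on the first colon only
            key, value = v.split(':', 1)
            result[key.strip()] = value.strip()
        else:
            # No colon: one more item for the most recent key
            result[key.strip()] += ', ' + v.strip()
    return result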

[get_dict(s) for s in attributesList]

This could probably be written more elegantly, but because a key can have multiple values, a dict comprehension would be too complicated.
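
For comparison, here is a rough sketch of a more compact variant. It assumes that keys never contain a comma or a colon, and it splits only on commas that are directly followed by a new key. The names `parse_attributes` and `sample` are made up for illustration, and the sample string is a shortened version of the one in the question:

```python
import re

def parse_attributes(s):
    # Split only on commas followed by "some text without , or :" and a colon,
    # i.e. commas that start a new key (assumption: keys contain no ',' or ':')
    pairs = re.split(r',\s*(?=[^,:]+:)', s)
    result = {}
    for pair in pairs:
        # partition() splits on the first colon only
        key, _, value = pair.partition(':')
        result[key.strip()] = value.strip()
    return result

sample = 'Health Score: A, Good For: Brunch, Lunch, Dinner, Parking: Street'
print(parse_attributes(sample))
# {'Health Score': 'A', 'Good For': 'Brunch, Lunch, Dinner', 'Parking': 'Street'}
```

Applied to the whole list, this would be `[parse_attributes(s) for s in attributesList]`.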

MrBean Bremen
  • Your solution does what I want except for one problem. The : pairs which have multiple values in the list do not appear accurately in the dictionary. For instance, there were two items like Good For: Brunch, Lunch, Dinner and Ambience: Casual, Trendy, Classy in the element of the list I posted. But the output appears as ' Good For': ' Brunch, Brunch, Brunch' and ' Ambience': ' Casual, Casual, Casual' respectively. What could be the issue? I guess I also need to clear the whitespaces from the current output. – Rnovice Mar 16 '20 at 21:56
  • I figured out the issue. I could fix the problem by replacing the value.strip() in the else statement to v.strip(). Many thanks for this solution. I am accepting your answer. – Rnovice Mar 16 '20 at 23:25
  • Ah, thanks - this was a typo, I corrected it in the answer. – MrBean Bremen Mar 17 '20 at 06:03
0

You can use regular expressions:

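import re

def gen_key(s):
    # Keys: the comma-free text immediately before each colon
    yield from (e.group(1).strip() for e in re.finditer(r'([^,]+?):', s))

def gen_values(s):
    # Values: everything after a colon, up to the start of the next key or the end of the string
    yield from (e.group().strip(' ,') for e in re.finditer(r'(?<=:)(.+?)(?=[^,]*?:|$)', s))

def gen(s):
    # Pair each key with its value
    yield from zip(gen_key(s), gen_values(s))

# One dict per list element; display the first one
attributesDicts = [dict(gen(s)) for s in attributesList]
attributesDicts[0]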

output:

{'Health Score': 'A',
 'Happy Hour Specials': 'Yes',
 'Vegan Options': 'Yes',
 'Takes Reservations': 'Yes',
 'Delivery': 'No',
 'Take-out': 'Yes',
 'Accepts Credit Cards': 'Yes',
 'Good For': 'Brunch, Lunch, Dinner',
 'Parking': 'Street',
 'Bike Parking': 'Yes',
 'Wheelchair Accessible': 'Yes',
 'Good for Kids': 'No',
 'Good for Groups': 'Yes',
 'Ambience': 'Casual, Trendy, Classy',
 'Noise Level': 'Average',
 'Alcohol': 'Beer & Wine Only',
 'Good For Happy Hour': 'Yes',
 'Outdoor Seating': 'Yes',
 'Wi-Fi': 'Free',
 'Has TV': 'No',
 'Waiter Service': 'Yes',
 'Caters': 'No',
 'Gender Neutral Restrooms': 'Yes'}
kederrac