3

Here is a specific example:

my_dict={k:int(encoded_value) 
         for (k,encoded_value) in 
             [encoded_key_value.split('=') for encoded_key_value in 
              many_encoded_key_values.split(',')]}

The question is about the inner list ([...]): can it be avoided, e.g.:

# This parses, but raises NameError at runtime
my_dict={k:int(encoded_value)
         for (k,encoded_value) in
             encoded_key_value.split('=') for encoded_key_value in
             many_encoded_key_values.split(',')}

..., which fails at runtime:

NameError: name 'encoded_key_value' is not defined

Sample data: aa=1,bb=2,cc=3,dd=4,ee=-5

Michael Goldshteyn
  • for this particular example maybe even `literal_eval` from `ast` could be helpful with some text manipulations. – Ma0 Aug 10 '17 at 15:13
  • @Ev.Kounis, I've also tried `result = ast.literal_eval('dict('+many_encoded_key_values+')')`. But I'm curious, it doesn't work: `... raise ValueError('malformed node or string: ' + repr(node))` – RomanPerekhrest Aug 10 '17 at 15:35
  • @RomanPerekhrest I tried `res = ast.literal_eval('{"' + many_encoded_key_values.replace('=', '":').replace(',', ',"') + '}')` and it did but it looked too ugly to post. – Ma0 Aug 10 '17 at 15:36
  • whoa, some serious case specific parsing hacks, there :) – Michael Goldshteyn Aug 10 '17 at 15:37
  • @MichaelGoldshteyn The pain in the neck was quoting the `abc`s. But @Roman has a very valid point. Why doesn't his `literal_eval` work? – Ma0 Aug 10 '17 at 15:38

4 Answers

5

As was mentioned, a generator expression improves your approach by avoiding the inner list. But there is a shorter way to get the needed result, using the re.findall() function:

import re

many_encoded_key_values = 'aa=1,bb=2,cc=3,dd=4,ee=-5'  # sample data from the question
result = {k:int(v) for k,v in re.findall(r'(\w+)=([^,]+)', many_encoded_key_values)}
print(result)

The output:

{'dd': 4, 'aa': 1, 'bb': 2, 'ee': -5, 'cc': 3}
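
For reference, here is a small sketch of what re.findall() hands to the comprehension, using the sample data from the question. Note that it returns a list of tuples, so an intermediate list still exists; it is just built inside the re module rather than by your own code.

```python
import re

many_encoded_key_values = 'aa=1,bb=2,cc=3,dd=4,ee=-5'  # sample data from the question

# With two groups in the pattern, findall returns a list of 2-tuples of strings.
pairs = re.findall(r'(\w+)=([^,]+)', many_encoded_key_values)
print(pairs)
# [('aa', '1'), ('bb', '2'), ('cc', '3'), ('dd', '4'), ('ee', '-5')]
```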

An alternative is the re.finditer() function, which returns an iterator of match objects (a 'callable_iterator' instance) instead of building a list:

result = {m.group(1):int(m.group(2)) for m in re.finditer(r'(\w+)=([^,]+)', many_encoded_key_values)}
RomanPerekhrest
3

You could avoid creating an intermediate list by using a generator expression instead:

my_dict={k:int(encoded_value)
         for (k,encoded_value) in
             (encoded_key_value.split('=') for encoded_key_value in
              many_encoded_key_values.split(','))}

Syntax-wise this is almost the same; instead of building an intermediate list first and then iterating over it, the elements are consumed on the fly.
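
A quick way to see the difference, as a rough sketch on the sample data from the question (the names as_list and as_gen are just for illustration): the list version holds all pairs at once, while the generator hands them out one by one as they are requested.

```python
many_encoded_key_values = 'aa=1,bb=2,cc=3,dd=4,ee=-5'  # sample data from the question

# Square brackets: every split pair is created and stored up front.
as_list = [item.split('=') for item in many_encoded_key_values.split(',')]

# Parentheses: nothing is split until something asks for the next element.
as_gen = (item.split('=') for item in many_encoded_key_values.split(','))

print(type(as_list).__name__)  # list
print(type(as_gen).__name__)   # generator

first = next(as_gen)           # only now is the first pair produced
remaining = list(as_gen)       # the other four pairs, produced on demand
print(first, len(remaining))
```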


To make this more explicit (if overly verbose), you could use a 'data pipeline' that consists of generators:

eq_statements = (item.strip() for item in many_encoded_key_values.split(','))
var_i = (var_i.split('=') for var_i in eq_statements)
my_dict = {var: int(i) for var, i in var_i}
print(my_dict)

(Unfortunately str.split does not return a generator, so in terms of saving memory this is not of much use here. For handling large files, things like this may come in handy.)

I found this answer, which implements split as an iterator, just in case...
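
To give an idea of what such an iterator-based split could look like, here is a minimal sketch built on re.finditer(). This is not the code from the answer mentioned above; iter_split is a hypothetical helper name, and it assumes a non-empty separator.

```python
import re

def iter_split(text, sep=','):
    """Lazily yield the pieces of text between occurrences of sep (hypothetical helper)."""
    start = 0
    for match in re.finditer(re.escape(sep), text):
        yield text[start:match.start()]
        start = match.end()
    yield text[start:]  # whatever follows the last separator

many_encoded_key_values = 'aa=1,bb=2,cc=3,dd=4,ee=-5'  # sample data from the question

# Same pipeline as above, but without str.split(',') building the full list first.
pairs = (item.split('=') for item in iter_split(many_encoded_key_values))
my_dict = {key: int(value) for key, value in pairs}
print(my_dict)
```

Each item.split('=') still builds a tiny two-element list, but the list of all items is never materialised.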

hiro protagonist
1

FWIW, here's a functional approach:

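def convert(s):
    # Split one 'key=value' item and convert the value to int
    k, v = s.split('=')
    return k, int(v)

data = 'aa=1,bb=2,cc=3,dd=4,ee=-5'  # sample data from the question
d = dict(map(convert, data.split(',')))
print(d)

output

{'aa': 1, 'bb': 2, 'cc': 3, 'dd': 4, 'ee': -5}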
PM 2Ring
0

A simple and compact variant that is very close to your original attempt:

d = {v.strip(): int(i) for s in data.split(',') for v, i in (s.split('='),)}

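The only additional 'trick' is to wrap s.split('=') in a one-element tuple (parentheses plus a trailing comma: (s.split('='),)) so that the inner for runs exactly once per item and unpacks both elements of the split in that single iteration. Note also that the for clauses are in the opposite order from your attempt: the outer loop over data.split(',') has to come first so that s is defined when the inner clause uses it.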
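
To make the clause order and the one-element tuple a bit more tangible, here is the same thing written as ordinary nested loops (a sketch on the sample data; d_loops and d_comp are just illustrative names). The for clauses of a comprehension nest left to right exactly like this.

```python
data = 'aa=1,bb=2,cc=3,dd=4,ee=-5'  # sample data from the question

# Equivalent nested loops: the outer clause comes first, so s exists
# by the time the inner clause is evaluated.
d_loops = {}
for s in data.split(','):
    for v, i in (s.split('='),):  # one-element tuple: the body runs once per s
        d_loops[v.strip()] = int(i)

# The comprehension from above, with the clauses in the same order.
d_comp = {v.strip(): int(i) for s in data.split(',') for v, i in (s.split('='),)}

print(d_loops == d_comp)  # True
```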

hiro protagonist
  • ...sorry for the additional answer(s). but i felt like this should be possible in a simpler way than what i presented first. this feels much more natural to me. – hiro protagonist Aug 10 '17 at 18:45