This question is about nested dictionary comprehensions; I consulted link1 and link2 before asking.
I have a list whose first element is None; the rest of it is a sorted sequence of positive numbers.

sorted_ar = [None, 10, 10, 12, 12, 12, 15, 25] 

My requirement is to build a dictionary as:

key_dict = {10: [3, 2], 12: [12, 3], 15: [6, 1], 25: [7, 1]}

Each value is a two-element list: the first element is the sum of the indexes at which the key occurs, and the second is the number of occurrences.
For example, for element 12 the sum of indexes is 3 + 4 + 5 = 12 and the number of occurrences is 3.
The following code does this:

key_dict = {k: [0, 0] for k in sorted_ar if k is not None}
for i in range(len(sorted_ar)):
    if sorted_ar[i]:  # skips the leading None
        key_dict[sorted_ar[i]][0] += i  # running sum of indexes
        key_dict[sorted_ar[i]][1] += 1  # running count of occurrences

I would like to build the key_dict dictionary with a dictionary comprehension instead.

My attempt:

key_dict = { 
    sorted_ar[i]:[ key_dict[sorted_ar[i]][0] + i,key_dict[sorted_ar[i]][0] + 1] 
    for i in range(1,len(sorted_ar)) if sorted_ar[i]!=None
}

But this gives an incorrect result:

key_dict = {10: [2, 1], 12: [5, 1], 15: [6, 1], 25: [7, 1]} 

How should I write the dictionary comprehension in this case?

4 Answers

If you are going to use sorting, then look at itertools.groupby() and the enumerate() function to add indices:

from itertools import groupby

filtered = ((i, v) for i, v in enumerate(sorted_ar) if v)
grouped = ((v, list(g)) for v, g in groupby(filtered, lambda iv: iv[1]))
result = {v: [sum(i for i, v in g), len(g)] for v, g in grouped}

You can put this all into a single expression if you so desire:

result = {v: [sum(i for i, v in g), len(g)] for v, g in (
    (v, list(g)) for v, g in groupby((
        (i, v) for i, v in enumerate(sorted_ar) if v), lambda iv: iv[1]))}

Demo:

>>> from itertools import groupby
>>> sorted_ar = [None, 10, 10, 12, 12, 12, 15, 25]
>>> filtered = ((i, v) for i, v in enumerate(sorted_ar) if v)
>>> grouped = ((v, list(g)) for v, g in groupby(filtered, lambda iv: iv[1]))
>>> {v: [sum(i for i, v in g), len(g)] for v, g in grouped}
{10: [3, 2], 12: [12, 3], 15: [6, 1], 25: [7, 1]}

or as one long expression:

>>> {v: [sum(i for i, v in g), len(g)] for v, g in ((v, list(g)) for v, g in groupby(((i, v) for i, v in enumerate(sorted_ar) if v), lambda iv: iv[1]))}
{10: [3, 2], 12: [12, 3], 15: [6, 1], 25: [7, 1]}

Your dictionary approach, on the other hand, does not require the input to be sorted, so it can run in O(N) time (sorting takes O(N log N) time).
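To illustrate the point about unsorted input, here is a minimal sketch of a single-pass version of the dictionary approach. The function name `index_sum_and_count` and the sample list `unsorted_ar` are made up for this example; the only assumption is that None entries should be skipped, as in the question.

```python
def index_sum_and_count(values):
    """Map each non-None value to [sum of its indexes, number of occurrences]."""
    result = {}
    for index, value in enumerate(values):
        if value is None:
            continue
        # setdefault creates the [0, 0] entry the first time a value is seen
        entry = result.setdefault(value, [0, 0])
        entry[0] += index
        entry[1] += 1
    return result


# Hypothetical unsorted input: no sorting step is needed.
unsorted_ar = [None, 12, 10, 25, 12, 10, 15, 12]
print(index_sum_and_count(unsorted_ar))
```

Each element is visited once, so the work grows linearly with the length of the list, whereas groupby() only groups adjacent equal items and so depends on the input already being sorted.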

Martijn Pieters

You can try this:

sorted_ar = [None, 10, 10, 12, 12, 12, 15, 25] 
new_data = {i:[sum(c for c, b in enumerate(sorted_ar) if b == i), sorted_ar.count(i)] for i in sorted_ar if i}

Output:

{25: [7, 1], 10: [3, 2], 12: [12, 3], 15: [6, 1]}
Ajax1234
  • Both solutions are correct, but there is a reason I tried it the way I did. The for-loop code that works (3rd snippet) is simple and readable and has no comparison operations; the values are updated as each key is found. I am looking for a solution like that, with no comparisons, written in basic Python akin to the for-loop solution. – Animesh Mukherkjee Jan 08 '18 at 17:57

OK, I have found a way to get the behavior I wanted. I am not sure why, but the trick is that the dictionary updates need to happen outside the comprehension.

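def my_summer(i, num, key_dict):
    # Add index i to the running sum for num and return the new sum.
    key_dict[num][0] += i
    return key_dict[num][0]


def my_counter(num, key_dict):
    # Increment the running count for num and return the new count.
    key_dict[num][1] += 1
    return key_dict[num][1]

sorted_ar = [None, 10, 10, 12, 12, 12, 15, 25]
key_dict = {k: [0, 0] for k in sorted_ar if k is not None}

# Repeated keys are overwritten, so only the final running totals survive.
key_dict = {sorted_ar[i]: [my_summer(i, sorted_ar[i], key_dict), my_counter(sorted_ar[i], key_dict)] for i in range(1, len(sorted_ar))}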

Output: {10: [3, 2], 12: [12, 3], 15: [6, 1], 25: [7, 1]}
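One possible explanation, offered as a sketch and not a definitive account: while the comprehension is being evaluated, the name key_dict still refers to the zero-initialised dictionary, because the new dictionary is only bound to that name once the comprehension has finished. The helpers mutate that older dictionary and return running totals, and when a key repeats inside a comprehension the last value wins. The names `totals` and `snapshots` below are illustrative only and spell out the same steps with a plain loop.

```python
sorted_ar = [None, 10, 10, 12, 12, 12, 15, 25]

# Zero-initialised accumulator, playing the role of the first key_dict.
totals = {k: [0, 0] for k in sorted_ar if k is not None}

# Record the running totals each time an element is seen.
snapshots = []
for i in range(1, len(sorted_ar)):
    num = sorted_ar[i]
    totals[num][0] += i
    totals[num][1] += 1
    snapshots.append((num, list(totals[num])))

print(snapshots[:2])  # [(10, [1, 1]), (10, [3, 2])]

# dict() keeps the last pair for each repeated key, as a comprehension does.
result = dict(snapshots)
print(result)
```

The original attempt in the question read the totals from key_dict but never wrote them back, so every element started again from [0, 0].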

You can try something like this:

sorted_ar = [None,10, 10, 12, 12, 12, 15, 25]

track={}
for i,j in enumerate(sorted_ar):
    if j not in track:
        track[j]=[(i,1)]
    else:
        track[j].append((i,1))


final_={}
for i,j in track.items():
    final_[i]=(sum(x[0] for x in j), sum(x[1] for x in j))

print(final_)

output:

{None: (0, 1), 10: (3, 2), 15: (6, 1), 12: (12, 3), 25: (7, 1)}
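The output above still contains the None key and uses tuples where the question asked for lists. A minimal variation on the same two-step idea, assuming None should be dropped and lists are wanted, might look like this (the name `positions` is introduced here only for illustration):

```python
from collections import defaultdict

sorted_ar = [None, 10, 10, 12, 12, 12, 15, 25]

# Step 1: collect the indexes at which each non-None value occurs.
positions = defaultdict(list)
for index, value in enumerate(sorted_ar):
    if value is not None:
        positions[value].append(index)

# Step 2: reduce each index list to [sum of indexes, number of occurrences].
key_dict = {value: [sum(idx), len(idx)] for value, idx in positions.items()}
print(key_dict)
```

The final step is a dictionary comprehension, but the grouping itself still happens in an ordinary loop.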