2

I know similar questions have been asked before but I cannot find an adequate answer to my situation.

Let's say I have the following list of tuples:

d = [('first', 1), ('second', 2), ('third', 3)]

I can convert this very easily into a dictionary:

dict(d)
# {'first': 1, 'second': 2, 'third': 3}

Now, if I instead have the following list of tuples:

d = [('a', 'first', 1), ('a', 'second', 2), ('b', 'third', 3)]

How can I, most efficiently, get the following nested dictionary:

{'a': {'first': 1, 'second': 2}, 'b': {'third': 3}}

Here's the solution I have now:

from collections import defaultdict
dd = defaultdict(dict)
for a, b, c in d:
    dd[a][b] = c
# defaultdict(dict, {'a': {'first': 1, 'second': 2}, 'b': {'third': 3}})

Is this the most performant way of doing this? Is it possible to avoid the for loop? It's likely that I have to deal with cases where d is very large, and this method may not scale very well. This part is critical to a web application I am building, and that's why performance is very important.

Input/feedback/help is appreciated!

besi
  • 171
  • 1
  • 9
  • Your can avoid explicit `for` loop, but some form of loping is unavoidable. – PM 77-1 Oct 07 '22 at 15:36
  • I believe you are correct about that. If I had to choose, however, then I'd trust python more than myself to do the looping. – besi Oct 07 '22 at 15:39

1 Answers1

3

You can do with groupby

from itertools import groupby
result = {k:dict([(i[1],i[2]) for i in l]) for k, l in groupby(d, key=lambda x: x[0])}

# Result
{'a': {'first': 1, 'second': 2}, 'b': {'third': 3}}

When you use groupby the values are expected to sort with key.

Edit:

You can avoid the second looping by using map.

{k:dict(map(lambda x: (x[1],x[2]), l)) for k, l in groupby(d, key=lambda x: x[0])}
Rahul K P
  • 15,740
  • 4
  • 35
  • 52
  • Excellent! I had not considered `groupby`. I'll give it a try. – besi Oct 07 '22 at 15:46
  • Looking at your solution, you are using two for loops instead of one. Granted, one is only over three items, but it is done `len(d)` times. The performance is quite comparable to the `defaultdict` solution, however, if not a little bit slower. I'm guessing `groupby` will be more memory efficient, correct? That, too, is an important consideration. – besi Oct 07 '22 at 15:59
  • Just for the record: https://stackoverflow.com/questions/773/how-do-i-use-itertools-groupby – PM 77-1 Oct 07 '22 at 16:03