1

I've a nested dictionary such as this:

from collections import defaultdict, OrderedDict
dict = defaultdict(lambda:OrderedDict())
dict[1]['ab'] = 2343
dict[1]['ac'] = 6867
dict[1]['ad'] = 2345
dict[2]['sa'] = 2355
dict[2]['sg'] = 4545
dict[2]['sf'] = 2445
dict[3]['df'] = 9988

I want the count of the values for each keys. The count of each item will be run through an algorithm to determine where the next value needs to be added/removed. Right now I've this:

count = {}
for k, v in dict.items():
    count[k] = len(v)

Scaling is important here as I'm dealing with very large databases. I've to have the count every time I do something with the dictionary and if I'm accessing it a million times, I'd have to create count each time. Is there a more efficient/pythonic way to do this? May be create a custom class similar to this that keeps a count for each item as and when it is created/removed?

Community
  • 1
  • 1
0x0
  • 2,915
  • 5
  • 40
  • 67
  • 1
    Note that you can just do `d = defaultdict(OrderedDict)`, and shouldn't call your own object `dict`. Given that `len(d)` is `O(1)`, why cache the lengths? – jonrsharpe Sep 08 '14 at 14:45
  • @jonrsharpe Good point. I was just thinking if there are ~10K items, then I'd have to do `O(10k)` operations each time which is why I was concerned. – 0x0 Sep 08 '14 at 14:50
  • It's not clear what you're using the count for, or why you recalculate the whole thing every time. You could update `count[k]` when you change `d[k]`, or just create the whole `sum(map(len, d.items()))` if you don't need it that often. – jonrsharpe Sep 08 '14 at 15:01
  • @jonrsharpe Added explanation for the need for count. Yes, I can keep a variable `count[k]` and update it every time but the issue is that I need a global/public variable and when the program gets large, updating the count every time at different places makes the code ugly/unreadable/error-prone. – 0x0 Sep 08 '14 at 15:12
  • 2
    In that case, a custom object (`dict` subclass, or just `dict`-like) that keeps track of these counts as you go along is probably the best bet. – jonrsharpe Sep 08 '14 at 15:14

1 Answers1

0

Create your own class the inherits the dict class and add the count array that you need and have the new add/remove functions update it accordingly. Add a function that given a key returns the count and that way you can get the updated count without any extra time wasted other than the time you already have to spend in adding/removing from the dictionary. There are examples of how to subclass and override the add/remove functions here

Community
  • 1
  • 1
tomer.z
  • 1,033
  • 1
  • 10
  • 25