1

I have created three dictionaries: dict1, dict2, and dict3. I want to update dict1 with dict2 first, and then update the resulting dictionary with dict3. I am not sure why the counts are not adding up.

def wordcount_directory(directory):
    dict = {}
    filelist=[os.path.join(directory,f) for f in os.listdir(directory)]
    dicts=[wordcount_file(file) for file in filelist]
    dict1=dicts[0]
    dict2=dicts[1]
    dict3=dicts[2]
    for k,v in dict1.iteritems():
        if k in dict2.keys():
            dict1[k]+=1
        else:
            dict1[k]=v
    for k1,v1 in dict1.iteritems():
        if k1 in dict3.keys():
            dict1[k1]+=1
        else:
            dict1[k1]=v1
    return dict1

print wordcount_directory("C:\\Users\\Phil2040\\Desktop\\Word_count")  
ronakg
Alph
  • You need to elaborate on `I am not sure why they are not adding up.`. Post some sample values for `dict1`, `dict2` and `dict3`. – ronakg Jan 28 '15 at 04:08

4 Answers

4

Maybe I am not understanding your question correctly, but are you trying to add all the values from each of the dictionaries together into one final dictionary? If so:

dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 5, 'c': 1, 'd': 9}
dict3 = {'d': 1, 'e': 7}

def add_dict(to_dict, from_dict):
    for key, value in from_dict.iteritems():
        to_dict[key] = to_dict.get(key, 0) + value

result = dict(dict1)
add_dict(result, dict2)
add_dict(result, dict3)
print result

This yields: {'a': 1, 'c': 4, 'b': 7, 'e': 7, 'd': 10}

It would be really helpful to post what the expected outcome should be for your question.

EDIT:

For an arbitrary number of dictionaries:

result = dict(dicts[0])
for dict_sum in dicts[1:]:
    add_dict(result, dict_sum)
print(result)

If you really want to fix the code from your original question in the format it is in:

  1. You are using dict1[k]+=1 when you should be performing dict1[k]+=dict2.get(k, 0).
  2. The introduction of get removes the need to check for its existence with an if statement.
  3. You need to iterate through dict2 and dict3 to introduce new keys from them into dict1.
  4. (not really a problem, but worth mentioning) In the if statement that checks whether the key is in the dictionary, it is recommended to simplify the operation to if k in dict2: (see this post for more details)
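Applied to the original loops, those fixes might look like the sketch below. It uses Python 3 syntax (`.items()` and `print()`; in Python 2 the loops would use `.iteritems()`), reuses the sample dictionaries from above, and copies `dict1` into a new name, `merged`, which is an assumption and not part of the original code.

```python
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 5, 'c': 1, 'd': 9}
dict3 = {'d': 1, 'e': 7}

merged = dict(dict1)  # copy, so dict1 itself is left untouched

# Iterate over the dictionary being merged in, not the one being updated,
# so that keys found only in dict2 or dict3 are picked up as well.
for k, v in dict2.items():
    merged[k] = merged.get(k, 0) + v    # add the real count, not 1
for k1, v1 in dict3.items():
    merged[k1] = merged.get(k1, 0) + v1

print(merged)
```

If modifying `dict1` in place is acceptable, the copy can be dropped and the loops can update `dict1` directly.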

With the amazing built-in library found by @DisplacedAussie, the answer can be simplified even further:

from collections import Counter

print(Counter(dict1) + Counter(dict2) + Counter(dict3))

The result yields: Counter({'d': 10, 'b': 7, 'e': 7, 'c': 4, 'a': 1})

The Counter object is a sub-class of dict, so it can be used in the same way as a standard dict.
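For a list of dictionaries of unknown length, one possible sketch is to accumulate into a single Counter with `update`, which adds counts rather than replacing them. The helper name `merge_counts` is hypothetical, and the code is written for Python 3.

```python
from collections import Counter


def merge_counts(dicts):
    """Return a Counter holding the summed counts of every dict in dicts."""
    total = Counter()
    for d in dicts:
        total.update(d)  # Counter.update adds to existing counts
    return total


dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 5, 'c': 1, 'd': 9}
dict3 = {'d': 1, 'e': 7}

total = merge_counts([dict1, dict2, dict3])
print(total)
```

One difference worth knowing: adding Counters with `+` discards keys whose resulting count is zero or negative, while `update` keeps them. For word counts, which are always positive, the two approaches give the same result.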

jakebird451
  • Yes. That is correct. How can I include this function to my original function (wordcount_directory)? – Alph Jan 28 '15 at 04:13
  • It's even easier with collections.Counter: `from collections import Counter`, then `Counter(dict1) + Counter(dict2) + Counter(dict3)` – DisplacedAussie Jan 28 '15 at 04:16
  • @Phil2014 Place the function I made in your file, and the last four lines in your existing function (replacing print with return). In my answer I made a copy of `dict1` into a new variable `result` so `dict1` was not modified. If you do not mind `dict1` being altered, you can apply all the operations directly to `dict1`. – jakebird451 Jan 28 '15 at 04:25
  • @DisplacedAussie That is hilarious. I am assured every day that Python has a library for *everything*. Thanks for sharing! That is a neat library. – jakebird451 Jan 28 '15 at 04:25
  • @PM2Ring No, I really like the library. I find it funny because it would substitute the function I made and make this solution into one line. – jakebird451 Jan 28 '15 at 04:47
  • @jakebird451 Why was mine not working? My idea was to update dict1 with dict2 first, and then the resulting dict1 with dict3. – Alph Jan 28 '15 at 04:49
  • While I prefer the .get(...,0) approach to my use of sets ... and I did upvote this answer; I think that having the function modify the *second* argument is counter-intuitive. Conventionally if a function mutates one of its arguments then those should be the leftmost in the parameter list. This matches the intuitions of many programmers regarding "L-values" and "R-values" ... in other words it looks more like an assignment. – Jim Dennis Jan 28 '15 at 04:52
  • @JimDennis Yea, I had that vibe as I was writing the answer. I guess I went with the unix ordering of from, to. But this isn't unix land. I'll change it to make more sense. Thanks for the input. – jakebird451 Jan 28 '15 at 04:58
  • @Phil2014 I added a couple of notes in my answer to why your original code did not work. – jakebird451 Jan 28 '15 at 05:12
  • @jakebird451 I hard-coded dict1, dict2, dict3. If I have a lot of dictionaries, how can I automate it? – Alph Jan 28 '15 at 05:20
  • This is great. I really appreciate it. – Alph Jan 28 '15 at 05:42
3

Hmmm, here's a simple function that might help:

def dictsum(dict1, dict2):
   '''Modify dict1 to accumulate new sums from dict2
   '''
   k1 = set(dict1.keys())
   k2 = set(dict2.keys())
   for i in k1 & k2:
       dict1[i] += dict2[i]
   for i in k2 - k1:
       dict1[i] = dict2[i]
   return None

... for keys in the intersection, add the second value to the existing one; then, for keys found only in dict2, copy those key/value pairs over.

With that defined you'd simply call:

dictsum(dict1, dict2)
dictsum(dict1, dict3)

... and be happy.

(I will note that functions that modify the contents of dictionaries in this fashion are not all that common. I'm returning None explicitly to follow the convention established by the list.sort() method ... functions which modify the contents of a container, in Python, do not normally return copies of the container).
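In Python 3, the key views returned by `.keys()` support set operations directly, so the explicit `set(...)` calls are not needed there. A minimal sketch of the same function under that assumption (Python 3 only; in Python 2 `.keys()` returns a plain list, and the sample values are made up for illustration):

```python
def dictsum(dict1, dict2):
    """Modify dict1 in place to accumulate sums from dict2."""
    # Both expressions build new sets, so dict1 can be mutated safely below.
    shared = dict1.keys() & dict2.keys()
    only_in_dict2 = dict2.keys() - dict1.keys()
    for key in shared:
        dict1[key] += dict2[key]
    for key in only_in_dict2:
        dict1[key] = dict2[key]
    # No return statement: like list.sort(), the function returns None.


counts = {'a': 1, 'b': 2}
returned = dictsum(counts, {'b': 5, 'd': 9})
print(counts)
print(returned)
```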

Jim Dennis
  • Nice work, Jim. I probably wouldn't do the explicit `return None`, but it's certainly worth mentioning that functions that do in-place modifications of containers conventionally return `None` in Python. – PM 2Ring Jan 28 '15 at 04:38
  • Yes, it would be implicitly the same ... that's really just a hint to those reading and maintaining the code. – Jim Dennis Jan 28 '15 at 04:48
  • I'll confess that my use of sets here is slightly gratuitous. .iteritems() with .get(..., 0) is more space efficient. However, I think it's a useful example to remind readers that set operations (intersection and difference) are frequently useful with dictionary keys. – Jim Dennis Jan 28 '15 at 04:54
  • Good point; `.get(..., 0)` is certainly better than extracting the key list and building a set from it, both space-wise & time-wise. OTOH, it'd be interesting to compare timings, since the set intersection & difference operations are rather efficient, IIRC, so the above algorithm might end up being faster than a `get`-based one. – PM 2Ring Jan 28 '15 at 05:05
2

If I understand your question correctly, you are iterating on the wrong dictionary. You want to iterate over dict2 and update dict1 with matching keys or add non-matching keys to dict1.

If so, here's how you need to update the for loops:

for k,v in dict2.iteritems():     # Iterate over dict2
    if k in dict1:
        dict1[k]+=v               # Add dict2's count for matching keys
    else:
        dict1[k]=v                # Add non-matching keys to dict1
for k1,v1 in dict3.iteritems():   # Iterate over dict3
    if k1 in dict1:
        dict1[k1]+=v1             # Add dict3's count for matching keys
    else:
        dict1[k1]=v1              # Add non-matching keys to dict1
ronakg
2

I assume that wordcount_file(file) returns a dict of the words found in file, with each key being a word and the associated value being the count for that word. If so, your updating algorithm is wrong. You should do something like this:

keys1 = dict1.keys()
for k,v in dict2.iteritems():
    if k in keys1:
        dict1[k] += v
    else:
        dict1[k] = v

If there's a lot of data in these dicts you can make the key lookup faster by storing the keys in a set (in Python 2, dict1.keys() returns a list, so each membership test scans it linearly):

keys1 = set(dict1.keys())

You should probably put that code into a function, so you don't need to duplicate the code when you want to update dict1 with the data in dict3.

You should take a look at collections.Counter, a subclass of dict that supports counting; using Counters would simplify this task considerably. But if this is an assignment (or you're using Python 2.6 or older) you may not be able to use Counters.

PM 2Ring
  • @PM 2Ring Yes. dicts is a list of three dictionaries and I split it into three. I thought it should work. – Alph Jan 28 '15 at 04:05
  • @Phil2014: Yes, that part of your code is fine; it's just your updating code that's wrong. Firstly, your `for` loops are iterating over the wrong `dict`, so in the first loop it will not pick up any keys that are in `dict2` if they aren't also present in `dict1`. Secondly, the `dict1[k]+=1` means that you're not adding the value from `dict2` to the value in `dict1`: you're only adding 1 to the count in `dict1`, no matter what the count is in `dict2` for that key. – PM 2Ring Jan 28 '15 at 04:35