
What I am trying to do is take a text file and return a dictionary of the anagrams in the file (words that make a new word when their letters are rearranged). So if the file contained the words dub and bud, the code should create a key bdu (the letters sorted alphabetically) and attach the strings dub and bud to it in a tuple, a list, or something similar.

My code outputs almost exactly what I want, except that instead of one key with multiple values, I'm getting an identical key for every value. To go back to my previous example, I get the key bdu for dub, then another bdu key for bud. How would I remove the identical keys and merge their values under one key?

def anagrams(f):
    '''takes a file and returns a list of anagrams in the file'''
    wordget = open(f).read().lower().split()
    dic = {}
    for w in wordget:
        if ("".join(sortword(w))) in wordget:
            dic = {("".join(sortword(w))):w}
            for key in dic.keys():
                print "'%s': %s" % (key, dic[key])
    return None

Any help would be appreciated. I also hope to come up with a solution that runs quickly, even with files containing tens of thousands of words (like books).

K DawG
Alex
  • possible duplicate of [How do I merge dictionaries together in Python?](http://stackoverflow.com/questions/2799064/how-do-i-merge-dictionaries-together-in-python) – Yohann Oct 16 '13 at 04:08

2 Answers


Python's defaultdict type in the collections module is useful for this kind of thing.

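from collections import defaultdict
from pprint import pprint

answer = defaultdict(list)
# filename is the path to the text file to scan
with open(filename) as f:
    for word in f.read().lower().split():
        # words made of the same letters share the same sorted-letter key
        answer[''.join(sorted(word))].append(word)
pprint(answer)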

The defaultdict constructor accepts a function that is called to create the value for any key that is missing. In this case the function is list, so each new key starts out with an empty list that we can append to immediately.
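
To make this concrete, here is a minimal sketch using the two words from the question. The names groups and result are only for illustration.

```python
from collections import defaultdict

# list() is called to build the value for any key that is missing
groups = defaultdict(list)

# The first access to 'bdu' creates an empty list, so append works right away.
groups['bdu'].append('dub')
groups['bdu'].append('bud')

# With a plain dict, the first append above would raise KeyError.
result = dict(groups)
```

Both words end up under the single key bdu, which is the merging behaviour the question asks for.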

You may also find the pprint module useful. It'll nicely format your lists of words.

GrantJ
  • Glad I could help, Alex. And welcome to StackOverflow. PS. Remember to [accept answers](http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work) by clicking the green check mark to the upper left of the answer. That's a big part of the community. – GrantJ Oct 16 '13 at 13:33

This

dic = {("".join(sortword(w))):w}

is replacing dic with a new dictionary each time. You should be inserting keys or appending to lists instead:

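def anagrams(f):
    wordget = open(f).read().lower().split()
    dic = {}
    for word in wordget:
        # the sorted letters are the key shared by all anagrams of word
        key = ''.join(sorted(word))
        if key in dic:
            dic[key].append(word)
        else:
            dic[key] = [word]
    return dic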

The if/else block can be tidied up using defaultdict, as in GrantJ's answer.
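
As a rough sketch of that tidy-up, the function below groups an iterable of words with defaultdict and then keeps only the keys that have more than one distinct word. The name group_anagrams, the words parameter, the use of sets to drop repeated words, and the filtering step are all assumptions made for illustration; they are not part of either answer.

```python
from collections import defaultdict

def group_anagrams(words):
    """Map each sorted-letter key to the words that share those letters.

    Hypothetical helper: words is any iterable of strings.
    """
    groups = defaultdict(set)  # a set drops words repeated in the text
    for word in words:
        word = word.lower()
        groups[''.join(sorted(word))].add(word)
    # Keep only the keys that collected at least two different words.
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}

example = group_anagrams(['dub', 'bud', 'cat', 'Bud'])
```

Each word is sorted once and looked up once, so the total work grows roughly linearly with the number of words, which should be reasonable for book-sized files.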

John La Rooy