Sorting list based on frequency if letters - Huffman Algorithm

Question

I don't know how to order a list of letters and their frequencies in ascending order i.e {'z':1, 'g':3, 'a':5, and so on}

I'm trying to recreate the Huffman Algorithm, a lossless compression algorithm, in Python. txt is a string of text, that was been split up so each letter, including spaces, is an individual index. I have tried using Counter(txt), which finds how many times each letter appears in txt and creates a dictionary. But this orders the dictionary from highest frequency to lowest frequency, and I need it to be vice-versa, so that it follows the steps of Huffman Algorithm. I then tried adding

for key, value in sorted(freq.iteritems(), key=lambda(k,v): (v,k)):
    print("%s: %s" % (key, value))

However this creates a syntax error, and I dont know if this is the best way to do it.

Here is my code:

from collections import Counter
def huffman(file):
    txt = list(map(lambda c2: c2, file)) # Places each individual char into array.
    freq=Counter(txt) #Counts numb of times a letter appears.
    print(freq)
    for key, value in sorted(freq.iteritems(), key=lambda(k,v): (v,k)):
        print("%s: %s" % (key, value))

I just need the freq dictionary to be order from least common to most common so that it follows the steps of Huffman's algorithm. So instead of {'a':5, 'g':3, 'z':1} it is {'z':1, 'g':3, 'a':5}

python dictionaries do not maintain an internal 'order'. The result you are getting is only a coincidence, in truth it is not ordered at all. You can create an `OrderedDict` from `collections` which will remember the order that items are inserted into the dict. So you can make sure to add them in the order youc are about. — Woody Pride, Feb 05 '19 at 18:17
@WoodyPride The ordering of dicts are dependant on the Python version used. Since Python 3.7 they are ordered by insertion order as per the documentation — SimonF, Feb 05 '19 at 18:20

taurus05 · Accepted Answer · 2019-02-05T18:37:38.377

2

On python version 3.6 or below, use this:

from collections import OrderedDict freq = OrderedDict(sorted(freq.items(), key=lambda x: x[1]))

Beginning from python version 3.7, you can use this:
freq = dict(sorted(freq.items(), key=lambda x: x[1]))

Dictonary beginning from version 3.7 and above dictonary is ordered by default. The fist element of each tuple is the alphabet and the second element is it's frequency. So, in the sorted function, we use the frequency of each element as the key to sort the elements in increasing order.

edited Feb 05 '19 at 18:37

answered Feb 05 '19 at 18:16

taurus05

2,491
15
28

1

Hm. Wouldn't this turn it into a list like : `[('z', 1), ('g', 3), ('a', 5)]`? The OP wants a dictionary I believe. – LeKhan9 Feb 05 '19 at 18:20
1

Unfortunately, that's still not right :) Dict will just convert it into a regular dictionary without order. You almost have it though! – LeKhan9 Feb 05 '19 at 18:23
I think, the order of the elements inside the list, should be maintained.. OrderedDict might help. – taurus05 Feb 05 '19 at 18:25
1

@LeKhan9 That ordering depends on the Python version used, Python3.7 keeps insertion order – SimonF Feb 05 '19 at 18:25
Note that ordering by insertion is implemented in CPython for 3.6 but only guaranteed since Python 3.7. That means that other implementations of Python will not support it before version 3.7. Please add this to your answer – SimonF Feb 05 '19 at 18:36
1

Thank you @taurus05! I have stuck on this for hours trying to figure it out! Just a quick question though, what does the `lambda x: x[1]` part do? Just simply curious. – DavalliTV Feb 05 '19 at 18:51
If you know inline functions in c++, it is somewhat analogus to it. `lambdas` are anonymous functions, which are used to perform a set of tasks (which the fuctions can also accomplish), but they more pythonic. You can refer to this answer for more info. https://stackoverflow.com/questions/12264834/what-is-the-difference-for-python-between-lambda-and-regular-function – taurus05 Feb 05 '19 at 18:55

score 0 · Answer 2 · answered Feb 05 '19 at 18:26

0

If you really want an ordered dictionary back you have to jump through a couple of hoops :)

You'd want to sort that dictionary first to obtain a flat list:

import operator
a = {'a':5, 'g':3, 'z':1}
sorted_list = sorted(a.items(), key=operator.itemgetter(1))

and then, you'd pass that to an OrderedDict:

from collections import OrderedDict
ordered_dict = OrderedDict(sorted_list)

ordered_dict:

OrderedDict([('z', 1), ('g', 3), ('a', 5)])

then, you can index as such:

ordered_dict['z']

output:

answered Feb 05 '19 at 18:26

LeKhan9

1,300
1
5
15

1

Since Python3.7 dictionaries preserve insertion order – SimonF Feb 05 '19 at 18:27
Thank you for the quick response @LeKhan9! I ended up using the answer above, but I did try yours and it produced the same results. Thank you again! – DavalliTV Feb 05 '19 at 18:57

Sorting list based on frequency if letters - Huffman Algorithm

2 Answers2