I'm trying to figure out how to create a sorted representation of a dictionary whose values are tuples, using a custom comparator on the values, in Python 3 and in a generalised way.

I have read these topics, but I'm still struggling:

Sort a Python dictionary by value

How to use a custom comparison function in Python 3?

As a specific example, consider the problem I'm trying to solve: "get a list of products sorted by total cost, given a dictionary that contains the products (the key) a customer has in their checkout, along with the number and cost of each product (stored as a 2-tuple)". In Python 2, one could use something like this:

checkout_dict = {'Apples': (1, 3), 'Oranges': (3, 3), 'Grapes': (7, 1),
                 'Cheese': (10, 1), 'Crackers': (4, 4)}


from operator import itemgetter


def sort_dict(dict, comparison_func):
    return sorted(dict.iteritems(), key=itemgetter(1),
                  cmp=comparison_func)


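def cmp_total_cost(product_data_1, product_data_2):
    # total cost = number of items * cost per item
    total_product_cost_1 = (product_data_1[0]) * (product_data_1[1])
    total_product_cost_2 = (product_data_2[0]) * (product_data_2[1])
    # positive when the second product costs more, so the sort is descending
    return total_product_cost_2 - total_product_cost_1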


print sort_dict(checkout_dict, cmp_total_cost)

The expected output would look something like this:

[('Crackers', (4, 4)), ('Cheese', (10, 1)), ('Oranges', (3, 3)),
 ('Grapes', (7, 1)), ('Apples', (1, 3))]

However, in Python 3 the cmp parameter of sorted was removed, and the comparison behaviour has to be expressed through the key parameter instead.

I understand that we need to use something like the cmp_to_key function from the functools module, but I can't wrap my head around how to keep everything generalised. I'm confused about how itemgetter(1) can be combined with cmp_to_key and a custom comparison function.

Also, I understand that with the above example I could easily just loop over the dictionary first, and calculate the total costs, then do the sort, but I'm looking for a general solution I can apply for many different types of comparisons.

Note

I'd also like this to be as performant as possible. I found some info that using operator.itemgetter can really help speed things up: Sorting Dictionaries by Value in Python (improved?)

Jinglesting

1 Answer

If you simply want to get a list of tuples, ordered by the first element times the second element, this will do:

sorted(checkout_dict.items(), key=lambda item: item[1][0] * item[1][1])

#  [('Apples', (1, 3)), ('Grapes', (7, 1)), ('Oranges', (3, 3)), ('Cheese', (10, 1)),
#   ('Crackers', (4, 4))]

# or the other way around (highest total cost first)
sorted(checkout_dict.items(), key=lambda item: item[1][0] * item[1][1], reverse=True)
# [('Crackers', (4, 4)), ('Cheese', (10, 1)), ('Oranges', (3, 3)), ('Grapes', (7, 1)), 
#  ('Apples', (1, 3))]
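
To keep this general, the per-value calculation can be passed in as a parameter. The following is only a minimal sketch: sort_by_value and total_cost are hypothetical names, not part of any library, and it assumes every value is a (quantity, unit cost) 2-tuple as in the question.

```python
from operator import itemgetter

checkout_dict = {'Apples': (1, 3), 'Oranges': (3, 3), 'Grapes': (7, 1),
                 'Cheese': (10, 1), 'Crackers': (4, 4)}


def sort_by_value(mapping, value_key, reverse=False):
    """Sort (key, value) pairs by value_key applied to each value.

    sort_by_value is a hypothetical helper written for this example.
    """
    get_value = itemgetter(1)  # picks the value out of a (key, value) pair
    return sorted(mapping.items(),
                  key=lambda item: value_key(get_value(item)),
                  reverse=reverse)


def total_cost(product_data):
    # Assumes product_data is a (quantity, unit_cost) tuple.
    quantity, unit_cost = product_data
    return quantity * unit_cost


result = sort_by_value(checkout_dict, total_cost, reverse=True)
print(result)
# [('Crackers', (4, 4)), ('Cheese', (10, 1)), ('Oranges', (3, 3)),
#  ('Grapes', (7, 1)), ('Apples', (1, 3))]
```

Any function that maps a single value to something orderable (a number, a tuple, a string) can be passed as value_key, and it is called once per item rather than once per comparison.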
DeepSpace
  • Thanks DeepSpace, I'm looking for a general way to do this for any cmp function that uses the values of the elements of the tuples, where I can just specify it as a parameter. Perhaps the example I gave was a bit too specific. This seems really easy in 2.x, but not sure in 3! – Jinglesting May 14 '17 at 15:11
  • @Jinglesting I'm not sure I'm seeing the issue here to be honest. Instead of returning -1,0, 1 (for <, =, >) from `cmp` function you simply return the *actual* value to be compared from the `key` function. Python then figures out how to compare that returned value with other values. – DeepSpace May 14 '17 at 15:13
  • Also, the problem with the lambda function is all the overhead from the function calls (this is a problem in the 2.7 example I gave too). "operator.itemgetter" helps avoid these, but I'm not sure I can take advantage of this with my requirements. Could your solution be written in terms of "itemgetter"? – Jinglesting May 14 '17 at 15:14
  • _simply return the actual value to be compared from the key function._ I don't think it's always that easy. Sometimes I will need to compare values from separate entries like this `(item2_val[0] * item1_val[1]) - (item1_val[0] * item2_val[1])` where items_val is a tuple value of an entry in the dict – Jinglesting May 14 '17 at 15:20
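
For comparisons that genuinely need two entries at once, as in the last comment, functools.cmp_to_key can wrap an old-style comparison function. The sketch below is one possible way to combine it with itemgetter(1), not a definitive implementation: sort_dict and compare_items are illustrative names, and the comparison function is the one from the question with the multiplication corrected.

```python
from functools import cmp_to_key
from operator import itemgetter

checkout_dict = {'Apples': (1, 3), 'Oranges': (3, 3), 'Grapes': (7, 1),
                 'Cheese': (10, 1), 'Crackers': (4, 4)}


def sort_dict(mapping, comparison_func):
    """Sort (key, value) pairs with a cmp-style function over the values.

    sort_dict is a hypothetical helper mirroring the Python 2 version.
    """
    get_value = itemgetter(1)  # picks the value out of a (key, value) pair

    def compare_items(item_1, item_2):
        # Forward only the values to the user-supplied comparison.
        return comparison_func(get_value(item_1), get_value(item_2))

    return sorted(mapping.items(), key=cmp_to_key(compare_items))


def cmp_total_cost(product_data_1, product_data_2):
    total_1 = product_data_1[0] * product_data_1[1]
    total_2 = product_data_2[0] * product_data_2[1]
    # Negative when the first product costs more, so it sorts first.
    return total_2 - total_1


sorted_checkout = sort_dict(checkout_dict, cmp_total_cost)
print(sorted_checkout)
# [('Crackers', (4, 4)), ('Cheese', (10, 1)), ('Oranges', (3, 3)),
#  ('Grapes', (7, 1)), ('Apples', (1, 3))]
```

The comparison function here runs once per comparison made during the sort, whereas a plain key function runs once per item, so the key-based form is usually the faster choice whenever the ordering can be expressed as a single value per entry.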