3

I have a dictionary structure that maps an id (integer) into a number (double). The numbers are actually weights of an item.

I am writing a function that will allows me to fetch the id of a given weight (if the weight is found in the dict, else, it will return the id of the next closest (i.e. nearest matching) weight.

This is what I have so far:

def getBucketIdByValue(bucketed_items_dict, value):
    sorted_keys = sorted(bucketed_items_dict.keys())
    threshold = abs(bucketed_items_dict[sorted_keys[-2]] -bucketed_items_dict[sorted_keys[-1]]) # determine gap size between numbers

    # create a small dict containing likely candidates
    temp = dict([(x - value),x] for x in bucketed_items_dict.values() if abs(x - value) <= threshold)
    print 'DEBUG: Deviations list: ', temp.keys()
    smallest_deviation = min(temp.keys()) if value >= 0 else max(temp.keys()) # Not sure about this ?
    smallest_deviation_key = temp[smallest_deviation]
    print 'DEBUG: found bucketed item key:',smallest_deviation_key
    return smallest_deviation_key

I'm not sure the logic is actually correct (esp. where I obtain the smallest deviatioon). In any event, if even the logic is correct, this seems an overly complicated way of doing things. Is there a more elegant/pythonic way of doing this?

Off the top of my head, I think a more pythonic/elegant way would be to do something like passing a custom function to the min function - don't know if that is possible...

[[Update]]

I am running Python 2.6.5

Homunculus Reticulli
  • 65,167
  • 81
  • 216
  • 341
  • Reading through this... and trying to get my head around what you're trying to do... but .keys() is redundant on a sorted(), as sorted(dict) is only on the keys... – Jon Clements Jul 01 '12 at 17:40
  • Related: [Custom dictionary lookup in Python](http://stackoverflow.com/questions/5808970/custom-dictionary-lookup-in-python) – Sven Marnach Jul 01 '12 at 17:52

4 Answers4

4

Try sorting the items by the distance of their weight to your target value:

from operator import itemgetter
distances = ((k, abs(v - value)) for k, v in bucketed_items_dict.items())
return min(distances, key=itemgetter(1))[0]

Or using a lambda function instead of itemgetter:

distances = ((k, abs(v - value)) for k, v in bucketed_items_dict.items())
return min(distances, key=lambda x:x[1])[0]
Mark Byers
  • 811,555
  • 193
  • 1,581
  • 1,452
  • Could you please explain the last line?. I am not sure I understand the `itemgetter` and why you pass 1 to that function. – Homunculus Reticulli Jul 01 '12 at 17:44
  • @HomunculusReticulli: If `t = ('a','b')` then `t[1] == 'b'`. Also `itemgetter(1)(t) == 'b'`. Those two do the same thing. I've posted an update that uses a lambda function instead of itemgetter, in case you are more familiar with lambda functions. – Mark Byers Jul 01 '12 at 17:45
3
def getBucketIdByValue(bucket, value):
    distances = [( id , abs( number - value ) ) for id , number in bucket.items()]
    swapped = [( distance , id ) for id , distance in distances]
    minimum = min ( swapped )
    return minimum[1]

Or in short:

def getBucketIdByValue(bucket, value):
    return min((abs(number-value),id) for id,number in bucket.items())[1]

This function uses the bucket to create id/number pairs, then creates an iterator of distance/id pairs, then gets the first minimum pair of it and finally extract the id of that pair and returns it.

The distance is defined as the absolute value of the difference between the number and the sought-for value.

The minimum is defined as the pair with the lowest distance. If there are more, the pair with the lowest id is returned.

Marco de Wit
  • 2,686
  • 18
  • 22
  • +1 Actually I like this idea because by switching the key and value you don't need to specify a key for the `min` function. But when I originally saw this answer I thought it was the same as mine except badly formatted so that it doesn't fit on the screen. You could improve your answer by formatting it so there is no scroll bar. It would also be better if there were some explanation about how it works rather than just dumping code and hoping the reader can figure out what is going on. – Mark Byers Jul 01 '12 at 18:18
2

You can find the index of closest weight using bisect in sorted keys:

import bisect

def bisect_weight(sorted_keys, value):
    index = bisect.bisect(sorted_keys, value)
    # edge cases
    if index == 0: return sorted_keys[0]
    if index == len(sorted_keys): return sorted_keys[index - 1]
    minor_weight = sorted_keys[index - 1]
    greater_weight = sorted_keys[index]

    return minor_weight if abs(minor_weight - value) < abs(greater_weight - value) else greater_weight

This way you just need to check 2 weights and find the best one. Sorting and binary searching are probably faster than calc all weights and find the best one.

iurisilvio
  • 4,868
  • 1
  • 30
  • 36
1

I'd also consider the bisect module.

Jon Clements
  • 138,671
  • 33
  • 247
  • 280
  • And how would that help on something that is intrinsically unordered? – Sven Marnach Jul 01 '12 at 17:53
  • @SvenMarnach well, you don't use it on the dict... the ordering needs to be forced anyway (some kind of sort == list), so not an unreasonable suggestion. – Jon Clements Jul 01 '12 at 17:56
  • Order and bisect is probably faster than iterate in your (maybe) big list, so it is a good option. My answer uses bisect module. – iurisilvio Jul 01 '12 at 18:01
  • @JonClements: No, not an unreasonable suggestion. I just think it's a bit too succinct as an answer and would need a wee bit more detail. – Sven Marnach Jul 01 '12 at 18:06
  • 1
    @SvenMarnach Not much point now as a bisect option has been given and another accepted, so.. but yes agreed - I should have put more information into the post. – Jon Clements Jul 01 '12 at 18:11