6

I have a number of objects which I need to link to an integer number. These objects are ArcGIS Point objects (exactly what they are isn't relevant), which store an X and a Y value for a point, as floating point numbers.

I need to record that, for example:

Point(X = 2.765, Y = 3.982) -> 2
Point(X = 33.9, Y = 98.45) -> 7
Point(X = 1.23, Y = 2.43) -> 9
Point(X = 8.342, Y = 6.754) -> 5

I then need to be able to look up the resulting value by the X and Y values. I've tried using the Point objects as the dictionary's key, but this doesn't work as when I recreate the point object from the X and Y values it doesn't look it up properly anymore (presumably because the object ID has changed).

How should I go about linking these point values to the integers. Is there another way that I can use a dictionary?

mskfisher
  • 3,291
  • 4
  • 35
  • 48
robintw
  • 27,571
  • 51
  • 138
  • 205

5 Answers5

10

Add a hash method to your Point class:

...
def __hash__(self):
    return hash(self.x) ^ hash(self.y)
...

In other words, the hash of a point is a munging of the hash of the x and y coordinates.

EDIT: a better hash function (based on comments here) is:

...
def __hash__(self):
    return hash((self.x, self.y))
...

Because Python hashes tuples in a way that hash((p,q)) does not equal hash((q,p)), this will avoid hash collisions for points symmetric about the diagonal.

Then, you can use your Point object as keys for dictionaries, put them in sets, etc.

payne
  • 13,833
  • 5
  • 42
  • 49
  • That sounds good, but the problem is that I didn't write the Point class - it comes from the ArcGIS API. Is there a way to 'insert'/'override' methods to insert that kind of hash code to the method? – robintw Jan 21 '11 at 20:36
  • 1
    @robintw There is: Point.__hash__ = lambda self: return hash(self.x) ^ hash(self.y) – Josh Bleecher Snyder Jan 21 '11 at 20:46
  • @robintw: Also, you could create your own class subclassed from Point: `class HashPoint(Point): [class definition...]` which might be more maintainable. – senderle Jan 21 '11 at 22:10
  • @payne: This doesn't seem to work. I ran the command you gave to add the __hash__ method to the Point class, but when I use these Point objects as keys in my dictionary I get KeyErrors. What I do is: create a point (p1) with X = 1 and Y = 2. Add this to the dictionary with `d[p1] = 42`. I then try and select from the dictionary with a new point object (`p2`) created with exactly the same X and Y parameters, and I get a KeyError. Is there something else I need to do? – robintw Jan 21 '11 at 22:16
  • 1
    @robintw: first make sure your hashes are working right with Python's built-in hash() function. Does hash(p1) == hash(p2)? Also, make sure that your Point class has an __eq__() method to test for equality: what's p1.__eq__(p2) give you? – payne Jan 21 '11 at 22:26
  • The `hash()` function shows that the hashes are working fine, but the problem is that the Point class doesn't have a `__eq__()` method. It does have an equals method (as in, `p1.equals(p2)`). I tried to add an `__eq__()` method using the `lambda` things above but I couldn't get it to work. I guess I need to get an argument into the lambda function. – robintw Jan 22 '11 at 18:18
  • 1
    I would advise against using `hash(x) ^ hash(y)` as it is equivalent to `hash(y) ^ hash(x)`, meaning two symmetrical points would get the same hash, bad idea. (Meaning `hash(Point(1,2)) == hash(Point(2,1)`, which you probably don't want in a dictionary) – noio Feb 14 '11 at 22:13
  • I also advise against this, and not only for symmetrical points, but for points in the same quadrant! 3^4 == 9^2 – Kyle Wild Feb 15 '11 at 00:08
4

Python dictionary keys need to be immutable types.

You could use a tuple like (2.765, 3.982). As long as a tuple contains only immutable types, it can be used as a dictionary key.

Here's my test in the console:

>>> my_dict[(12.3151, 1.2541)] = "test"
>>> my_dict[(12.3151, 1.2541)]
'test'

You could come up with a simple string convention like "2.765, 3.982" to turn a point into an index, but that would be a waste of processing. Also, a word of caution: If for some reason you choose to do this, you must use repr instead of str (here's a Stack Overflow post on that topic).

Community
  • 1
  • 1
Kyle Wild
  • 8,845
  • 2
  • 36
  • 36
  • Thanks. Will this cause problems with the floats though, as they can't be stored exactly. I just worry that converting floats to strings and using them to index will lead to problems with floats that don't convert nicely to strings (and floats that can't even be stored exactly in memory). – robintw Jan 21 '11 at 20:51
  • @robintw if you need floating point accuracy, check out the decimal module http://docs.python.org/library/decimal.html – admalledd Jan 22 '11 at 11:19
  • @robintw - I haven't tested it, but I believe @admalledd is correct. Can you select an answer, or share how you solved the problem? – Kyle Wild Feb 15 '11 at 00:07
  • 2
    @dorkitude, @admalledd, @robintw: No loss of accuracy happens by just using the floats in a tuple: `dict_key = (point.x, point.y)`. Converting them to string or decimal just for use in a hash function is a waste of CPU time (and a loss of accuracy if you use `str()` instead of `repr()`). – John Machin Feb 21 '11 at 02:33
  • @John Machin - good to know :) I will leave my answer as-is, then! – Kyle Wild Feb 21 '11 at 07:56
  • @Dorkitude: Please don't leave it as is. It mentions "simple string convention" first, when it should be mentioned last, if at all, and only then to warn that (a) you need to use `repr()` not `str()` and (b) it's a waste of CPU time. – John Machin Feb 21 '11 at 09:48
1

You need to override __hash__ in Point.

novalis
  • 1,112
  • 6
  • 12
  • 1
    This is problem is a bit trickier than that, because __hash__ has to return an integer. You'd need either need unique integer IDs for each Point (meaning the code and machine overhead of maintaining a registry) or an integer transformation that can apply to `(float, float)` and generate unique integers -- and I'm at a loss for coming up with one of those off the top of my head. – Kyle Wild Jan 21 '11 at 20:14
  • 2
    @dorkitude: you can just do hash(some_float) to convert a float to a unique integer -- use Python's built-in. – Josh Bleecher Snyder Jan 21 '11 at 20:47
  • @josh that's cool! but the problem remains -- we're trying to transform two floats into one unique integer. – Kyle Wild Jan 21 '11 at 23:47
  • 1
    @dorkitude: It's enough to `return hash((self.x, self.y))` in `__hash__`. – Rosh Oxymoron Jan 21 '11 at 23:58
1

Use a namedtuple: http://docs.python.org/dev/library/collections.html#collections.namedtuple

Josh Bleecher Snyder
  • 8,262
  • 3
  • 35
  • 37
1

Add a __hash__() method to the point class like Payne said. Or manually compute a hash for each point. In any case, use something like what Python would natively do:

...
def __hash__(self):
    return hash( (self.x, self.y ) )
...
noio
  • 5,744
  • 7
  • 44
  • 61