4

Can someone please confirm, and explain, why this happens:

On simulator (7.1, 32-bit):

    NSNumber *a = [NSNumber numberWithFloat:0.5]; // hash = 506952114
    NSNumber *b = [NSNumber numberWithFloat:1.0]; // hash = 2654435761
    NSNumber *c = [NSNumber numberWithFloat:2.0]; // hash = 1013904226

On device (7.1, 32-bit):

    NSNumber *a = [NSNumber numberWithFloat:0.5]; // hash = 2654435761
    NSNumber *b = [NSNumber numberWithFloat:1.0]; // hash = 2654435761 - SAME!
    NSNumber *c = [NSNumber numberWithFloat:2.0]; // hash = 5308871522

I thought it might be a 32-bit issue, but I get the same result on the 64-bit simulator and device: the simulator is fine, while the device returns identical hashes for 0.5 and 1.0.

I was trying to add unique objects to an NSMutableOrderedSet and noticed that my two objects that were identical except for differing values of 0.5 and 1.0 were not both being added, and this is why. I tried both floats and doubles with the same result.

But why?

jowie
  • Do you understand the concept of `hash`? – Sulthan Jun 18 '14 at 11:22
  • Well to the best of my knowledge, it is an unsigned integer that (I assume) provides a unique reference to an object with a specific value... But if I'm missing something please let me know. – jowie Jun 18 '14 at 13:30
  • Yes, you are missing something: `hash` doesn't guarantee uniqueness. It's like a ZIP code. Lots of people share the same ZIP code, but ZIP codes are still a big help when you are searching for somebody. `hash` makes no sense unless you also have a good `isEqual:`. – Sulthan Jun 18 '14 at 13:44
  • wikipedia: [Hash function](http://en.wikipedia.org/wiki/Hash_function) – zaph Jun 18 '14 at 14:51
  • Yeah I realise that now. I was just surprised that a hash of NSNumber only returns a value based on its unsignedInteger. And I also wasn't expecting the behaviour on device to be different to the simulator. – jowie Jun 18 '14 at 14:56

2 Answers

2

I think this excellent article from Mike Ash might give some insight:

For floats that are integer values, we want to do the same thing. Since our isEqual: considers an integer-valued DOUBLE equal to an INT or UINT of the same value, we must return the same hash as the INT and UINT equivalent. To accomplish this, we check to see if the DOUBLE value is actually an integer, and return the integer value if so:

    if(_value.d == floor(_value.d))
        return [self unsignedIntegerValue];

(I won't quote the whole section about hash, so please read the article for the full picture.)
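
To see the rule from the quoted passage in action, here is a minimal sketch. It assumes only Foundation; the variable names are illustrative. Because `isEqual:` treats an integer-valued float and the matching integer as equal, their hashes have to agree as well, whatever the actual hash values turn out to be on a given platform.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // The same numeric value stored as a float and as an int.
        NSNumber *asFloat = [NSNumber numberWithFloat:1.0f];
        NSNumber *asInt = [NSNumber numberWithInt:1];

        // Equal objects are required to report equal hashes.
        NSLog(@"isEqual: %d", [asFloat isEqual:asInt]);
        NSLog(@"same hash: %d", asFloat.hash == asInt.hash);

        // Print the raw hashes for the values from the question;
        // these may differ between simulator and device.
        for (NSNumber *n in @[@0.5f, @1.0f, @2.0f]) {
            NSLog(@"%@ -> %lu", n, (unsigned long)n.hash);
        }
    }
    return 0;
}
```

The raw values printed by the loop are an implementation detail, so nothing should depend on them staying the same across OS versions or architectures.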

But, bottom line, it looks like using the value of `-[NSNumber hash]` as a key in an associative array or hash table is a bad idea. However, I cannot explain why it behaves differently on the simulator and the device; that looks somewhat troubling...

trojanfoe
  • Thanks... But how can I write my object's hash with `NSNumber` in mind so the two objects are unique? – jowie Jun 18 '14 at 10:34
  • Also, I am not concerned with `isEqual:` as this actually brings back the correct result. It's the hash that doesn't work. I also noticed that hashes of `NSNumber`s between 0.5 and 0.9 *all* return the same value. – jowie Jun 18 '14 at 10:37
  • @jowie But only on the device? On the simulator it looks like it works? – trojanfoe Jun 18 '14 at 10:37
  • yes - only on device. Unfortunately this is the way round it matters! – jowie Jun 18 '14 at 10:44
  • @jowie Yeah that is worrying. Apple recommend using string values for keys so you could try that: https://developer.apple.com/library/ios/documentation/cocoa/conceptual/ProgrammingWithObjectiveC/FoundationTypesandCollections/FoundationTypesandCollections.html – trojanfoe Jun 18 '14 at 10:48
  • Also, I've moved from overriding `isEqual:` to compare hashes to doing a proper compare of the object's properties: http://stackoverflow.com/questions/254281/best-practices-for-overriding-isequal-and-hash - now I understand the difference. – jowie Jun 19 '14 at 08:34
2

There is no guarantee that a hash for different inputs is different.

In this case, consider that there are only 2^32 possible hash values but orders of magnitude more distinct NSNumbers, so the hash cannot be used to establish uniqueness.

A rather short hash is generally used as a fast initial comparison; only if the hashes match is a full comparison of the objects performed. This is probably what `NSNumber`'s `isEqual:` does.
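
As a rough illustration of that two-step check, the sketch below uses a hypothetical `Reading` class (not from the question) whose `isEqual:` compares the stored value directly, while `hash` simply forwards to the `NSNumber`. Under that assumption, a collision between the hashes of 0.5 and 1.0 should not matter, because the collection falls back to `isEqual:` whenever hashes match.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model class; one NSNumber stands in for the real properties.
@interface Reading : NSObject
@property (nonatomic, strong) NSNumber *value;
- (instancetype)initWithValue:(NSNumber *)value;
@end

@implementation Reading

- (instancetype)initWithValue:(NSNumber *)value {
    self = [super init];
    if (self) {
        _value = value;
    }
    return self;
}

// Equality compares the actual property, never the hashes.
- (BOOL)isEqual:(id)object {
    if (self == object) return YES;
    if (![object isKindOfClass:[Reading class]]) return NO;
    Reading *other = (Reading *)object;
    return [self.value isEqual:other.value];
}

// Equal objects must return equal hashes; unequal objects may still collide.
- (NSUInteger)hash {
    return self.value.hash;
}

@end

int main(void) {
    @autoreleasepool {
        NSMutableOrderedSet *set = [NSMutableOrderedSet orderedSet];
        [set addObject:[[Reading alloc] initWithValue:@0.5f]];
        [set addObject:[[Reading alloc] initWithValue:@1.0f]];
        // Same numeric value as the previous element, so it counts as a duplicate.
        [set addObject:[[Reading alloc] initWithValue:@1]];
        NSLog(@"count = %lu", (unsigned long)set.count);
    }
    return 0;
}
```

With this setup the set should end up holding two elements, one for 0.5 and one for 1.0, even on a platform where the two hashes are identical. An `isEqual:` that only compares hashes would instead treat colliding objects as duplicates.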

That is why using a hash as a key in an NSSet is a bad idea, and, for the reasons @trojanfoe quoted from Mike Ash, an NSNumber hash in particular will not work for this.

Even cryptographic hashes such as SHA512 are not guaranteed to produce different results for different inputs, but the chance of a collision shrinks as the hash length increases. This is why MD5 is recommended against and even SHA2 is increasingly being considered too short.

zaph