I need to generate a unique hash for each object so that I can identify objects with exactly the same attributes.

After reading a bit on the subject, I decided it's best to use MD5 hashing rather than Java's `hashCode` (I have a large number of objects, around 200,000-300,000).

I found many examples of how to compute an MD5 hash of a string, but none showing how to do it for an object so that the hash is really unique with respect to the object's attributes.

Mark Rotteveel
Ofek Agmon
  • Have a look at [the JavaDocs of `MessageDigest`](https://docs.oracle.com/javase/7/docs/api/java/security/MessageDigest.html) and google for some examples. – deHaar Jul 22 '19 at 06:37
  • I meant I want two objects with the same attributes to have the same hashes/string identifiers. Maybe it wasn't clear - this is so I can detect if I got a similar object and skip it – Ofek Agmon Jul 22 '19 at 06:40
  • You're right, that wasn't clear to me. Do you want to detect *similar* or *equal* objects considering their attributes? I think `equals()` could be an option for the latter... – deHaar Jul 22 '19 at 06:43
  • I get 200,000 objects to process. for each one I process I generate a hash and store it in a cache, so that the next time I get an object with the same attributes I'll know to skip it - so I need equal objects (in terms of attributes) – Ofek Agmon Jul 22 '19 at 06:45
  • perhaps override the `hashCode()` method for that object? see also: https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode() https://stackoverflow.com/a/113600 https://stackoverflow.com/a/27609 – riyaz-ali Jul 22 '19 at 06:53
  • If you insist on using MD5, you could apply it to the `hashCode` values of your object's instance variables but - could you share here what advantage you see with MD5 ? – David Soroko Jul 22 '19 at 07:18
  • Java has no formal definition of "attributes" of an object, so that concept only has meaning to you. You need a method to order these "attributes" so that two objects with the same "attributes" can produce them in the same order. Then you need to convert these attributes into byte arrays and supply them to the hash function. This can seemingly be done in the standard way by overriding `hashCode` and `equals`, and I can see no value in using MD5. A standard `Set` or `Map` should suffice for your needs, based on the limited info you've provided. – President James K. Polk Jul 22 '19 at 13:46

1 Answer

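An object's hash code does not need to be unique. In fact, uniqueness is impossible in general: `hashCode()` returns a 32-bit `int`, and there are usually far more possible object states than `int` values.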
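As a quick illustration of why uniqueness cannot be guaranteed, the two distinct strings below are a well-known colliding pair under `String.hashCode()`. The class name `CollisionDemo` and the helper `sameHash` are made up for this sketch.

```java
public class CollisionDemo {

    // Hypothetical helper: true if both strings produce the same hash code.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        System.out.println("Aa".hashCode());        // 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println("BB".hashCode());        // 2112
        System.out.println(sameHash("Aa", "BB"));   // true
        System.out.println("Aa".equals("BB"));      // false
    }
}
```

Equal hash codes therefore do not imply equal objects; only the reverse direction is guaranteed.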

A hash code must comply with the following contract:

  • It must return the same value every time it is invoked on the same object, as long as the object's state has not changed.
  • If two objects are equal according to `Object::equals`, then `hashCode()` must return the same value for both.

Most IDEs can generate the `hashCode()` method, and the JDK itself provides helpers for computing hash codes, e.g. `java.util.Objects.hash(Object...)`.
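A minimal sketch of that approach, assuming a hypothetical `Item` class with two attributes (`name` and `price` are invented for illustration; substitute your own fields). `equals` and `hashCode` are built from the same fields, and a plain `HashSet` then acts as the cache of already-processed objects. The names `DedupDemo` and `shouldProcess` are likewise assumptions of this example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class DedupDemo {

    // Hypothetical class; the fields stand in for your real attributes.
    static final class Item {
        private final String name;
        private final int price;

        Item(String name, int price) {
            this.name = name;
            this.price = price;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Item)) return false;
            Item other = (Item) o;
            return price == other.price && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Uses exactly the fields that equals() compares.
            return Objects.hash(name, price);
        }
    }

    // Items that have already been processed.
    private final Set<Item> seen = new HashSet<>();

    // Returns true if no equal item was seen before (process it),
    // false if an equal item is already in the set (skip it).
    boolean shouldProcess(Item item) {
        return seen.add(item);
    }

    public static void main(String[] args) {
        DedupDemo demo = new DedupDemo();
        System.out.println(demo.shouldProcess(new Item("a", 1))); // true
        System.out.println(demo.shouldProcess(new Item("a", 1))); // false
        System.out.println(demo.shouldProcess(new Item("b", 1))); // true
    }
}
```

Because `HashSet` falls back on `equals` when hash codes collide, two different objects that happen to share a hash code are still treated as distinct, so the hash itself does not have to be unique.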

You can read a great summary here

Jónás Balázs