There are 10 fields in my database table. Of the 4 fields shown below, one is hash_attr. hash_attr signifies whether an incoming row is the same as the row stored in the database. If it is the same, we don't need to update the database; otherwise we do.

For example:

Fields:

1 (id) - 1

2 (name) - John

3 (type) - Coach

4 (attr_hash) - calculated with Java's hashCode()

Hash code logic: int code = (name + type).hashCode();

The idea of attr_hash is that we calculate the hash code for the incoming data, and if that hash code matches attr_hash, I will not update the database table because the row should be the same.

I think two different Strings can have the same hash code, according to this link, if we use the object's hashCode() method. So what should my hash code logic be to ensure that two different Strings cannot have the same hash code?

I hope the question is clear.

VJS
  • Hash code checks, as indicated below, aren't very useful for data of this size. Once you start stuffing 20MB image blobs in your database, though, keeping a 256-bit hash on hand makes a world of difference. – torquestomp Jul 17 '14 at 07:28
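
As a rough illustration of the 256-bit hash mentioned in the comment above, here is a minimal sketch that computes a SHA-256 digest with the JDK's `MessageDigest`. The class name `Sha256Demo` and the helper `sha256Hex` are hypothetical names chosen for this example. A 256-bit digest can still collide in principle, but accidental collisions are not a practical concern.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256Demo {
    // Hypothetical helper: returns the SHA-256 digest of the data as 64 hex characters.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(data);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to support SHA-256.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        // Assumption for this sketch: name and type are simply concatenated, as in the question.
        byte[] rowBytes = ("John" + "Coach").getBytes(StandardCharsets.UTF_8);
        System.out.println(sha256Hex(rowBytes));
    }
}
```

For rows as small as the one in the question, comparing the values directly is likely just as cheap. The digest pays off mainly when the stored value is large.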

2 Answers

You can't.

Proof:

  • There are 4,294,967,296 possible hash codes (because they are ints).
  • There are more than 4,294,967,296 possible strings. For example, there are 8,031,810,176 strings that consist of exactly 7 lowercase letters.
  • Therefore, at least two different strings must share the same hash code (by the pigeonhole principle).
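
The counts in the proof above can be checked with a short sketch, which also shows one well-known concrete collision. The class name `HashCollisionDemo` and the helper `lowercaseStrings` are hypothetical names used only for this illustration.

```java
public class HashCollisionDemo {
    // Number of distinct int values: 2^32.
    static final long POSSIBLE_HASH_CODES = 1L << 32;

    // Number of strings made of exactly `length` lowercase ASCII letters: 26^length.
    static long lowercaseStrings(int length) {
        long count = 1;
        for (int i = 0; i < length; i++) {
            count *= 26;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(POSSIBLE_HASH_CODES);   // 4294967296
        System.out.println(lowercaseStrings(7));   // 8031810176

        // "Aa" and "BB" are different strings with the same String.hashCode():
        // 65 * 31 + 97 == 66 * 31 + 66 == 2112
        System.out.println("Aa".hashCode());       // 2112
        System.out.println("BB".hashCode());       // 2112
        System.out.println("Aa".equals("BB"));     // false
    }
}
```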
user253751

hash_attr is basically signifies whether incoming row is same or not with the row stored in database. If it is same we don't need to update the database else we need to update.

Not quite. As the answer at that link said, different strings can have the same hash code. However, different hash codes must have been generated by different strings. You can use this to optimize your database access in some cases.

Specifically, if the hash codes differ, you know the strings differ, so the database needs to be updated. But if the hash codes are the same, the strings might still be different. In that case you need to call equals() on the strings to determine whether they really are different, and do the database update only if they are.
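
The check described above might look like the following minimal sketch. The `Row` class and the `needsUpdate` method are hypothetical names invented for this illustration, and the sketch assumes the stored name and type are available for comparison alongside the stored hash. It compares the two fields individually rather than the concatenated string, which is slightly stricter than comparing `name + type`.

```java
public class RowChangeCheck {
    // Hypothetical in-memory view of a stored row (not from the original post).
    static final class Row {
        final String name;
        final String type;
        final int attrHash;

        Row(String name, String type) {
            this.name = name;
            this.type = type;
            // Same hash logic as in the question.
            this.attrHash = (name + type).hashCode();
        }
    }

    // Returns true if the incoming values differ from the stored row.
    static boolean needsUpdate(Row stored, String name, String type) {
        int incomingHash = (name + type).hashCode();
        if (incomingHash != stored.attrHash) {
            // Different hash codes: the data definitely differs.
            return true;
        }
        // Same hash code: could be a collision, so compare the actual values.
        return !(stored.name.equals(name) && stored.type.equals(type));
    }

    public static void main(String[] args) {
        Row stored = new Row("John", "Coach");
        System.out.println(needsUpdate(stored, "John", "Coach"));   // false
        System.out.println(needsUpdate(stored, "John", "Player"));  // true
    }
}
```

With this approach the hash only serves as a quick way to detect rows that have certainly changed. It never replaces the final equals() comparison.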

Stuart Marks