I am using Boost's crc32 to compute checksums for strings, but collisions are occurring. Could anyone please suggest an alternative algorithm in Boost that would give a unique checksum for each string?

Karthik K M
  • It all depends on the strings you are using. We would need more specific info – bartop Sep 24 '19 at 13:03
  • The strings passed as input are unique, but crc32 gives the same checksum for two different strings – Karthik K M Sep 24 '19 at 13:10
  • That's understandable. Why do you need these crc32 values to be unique? – bartop Sep 24 '19 at 13:15
  • You can't guarantee that. The set of possible checksums is smaller than the set of possible strings, so you can't define an injective function from the latter to the former. Collisions are to be expected. – Laurent LA RIZZA Sep 24 '19 at 13:20
  • @bartop I want to distinguish between different strings. Rather than comparing every string in a loop, I am computing checksums and comparing those – Karthik K M Sep 24 '19 at 13:24
  • Try a different hash function, such as one of these: https://stackoverflow.com/a/57960443/2119377 but collisions cannot be avoided in general. – Wolfgang Brehm Sep 24 '19 at 13:41

1 Answer

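No fixed-size checksum can be unique for every string, because strings can be arbitrarily long: the hash space is smaller than the string space. A 32-bit CRC has only 2^32 distinct values, so by the pigeonhole principle some different strings must share a checksum.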
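To get a feel for how quickly collisions show up, here is a rough sketch of the birthday approximation. It assumes the checksums behave like uniformly random values, which is only an approximation for real inputs, and `collision_probability` is a name made up for this example. Under that assumption, roughly 77,000 distinct strings already give about a 50% chance of at least one 32-bit collision.

```cpp
#include <cassert>
#include <cmath>

// Approximate probability that at least two of n uniformly random values
// collide in a space of 2^bits values (birthday approximation):
//   p ~= 1 - exp(-n * (n - 1) / (2 * 2^bits))
// Hypothetical helper for illustration; assumes an ideal, uniform hash.
double collision_probability(double n, int bits) {
    assert(bits >= 1 && bits <= 64);
    const double space = std::ldexp(1.0, bits);  // 2^bits as a double
    // -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -std::expm1(-n * (n - 1.0) / (2.0 * space));
}
```

For example, `collision_probability(100000, 32)` is already well above one half, while `collision_probability(100000, 64)` is negligible. A wider hash makes collisions rarer, but never impossible.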

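That being said, try boost::hash or std::hash. Their algorithms are implementation-specific (some standard libraries use a MurmurHash variant), and a good hash function collides about as rarely as is statistically possible for its output size. Both return a std::size_t, which is 64 bits wide on typical 64-bit platforms, so accidental collisions are far less likely than with a 32-bit CRC, though still not impossible.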
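Since equal hashes can never prove that two strings are equal, one option for the use case in the question is to treat the hash as a cheap pre-check and compare the strings themselves only when the hashes match. The following is a minimal sketch of that idea using std::hash; `HashedString` and `same_string` are hypothetical names invented for this example, not part of Boost or the standard library.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: a string stored together with its precomputed hash.
struct HashedString {
    std::string text;
    std::size_t hash;  // declared after text, so text is initialized first

    explicit HashedString(std::string s)
        : text(std::move(s)), hash(std::hash<std::string>{}(text)) {}
};

// Different hashes prove the strings differ. Equal hashes prove nothing,
// so in that case fall back to a full comparison of the contents.
inline bool same_string(const HashedString& a, const HashedString& b) {
    return a.hash == b.hash && a.text == b.text;
}
```

Most pairs of different strings are rejected by the integer comparison alone, and the full comparison runs only on a hash match, so a collision costs a little time but never gives a wrong answer. std::unordered_set<std::string> applies the same combination of hashing and equality internally, which may be enough if the goal is just to collect the distinct strings.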

See also string hash functions

Wolfgang Brehm