Compute hash code of a string for bucketing with uniform distribution

Asked Sep 21 '16 at 20:52

Active Sep 21 '16 at 20:55

Viewed 286 times

I have a large dataset that won't fit into a single Cassandra DB partition. I'd like to spread the data across different buckets with a uniform distribution. I was planning to use .NET's GetHashCode() method, but its results are not guaranteed to be stable across .NET versions. Are there any hashing/bucketing algorithms that I can use to distribute the data uniformly across buckets?

edited Sep 21 '16 at 20:55

GregC


asked Sep 21 '16 at 20:52

user3281496

Have you already read this: http://stackoverflow.com/a/21115750/90475 – GregC Sep 21 '16 at 20:59
Possible duplicate of [What is the best algorithm for an overridden System.Object.GetHashCode?](http://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode) – GregC Sep 21 '16 at 21:00
This is an odd question - Cassandra is going to hash your partition key using murmur3 or md5 anyway. Why can't you just use some meaningful part of the data as the partition key so that you're able to query it later? – Jeff Jirsa Sep 22 '16 at 05:43
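
For illustration, here is a minimal sketch of version-independent bucketing. It is written in Python rather than .NET, and it assumes MD5 over the UTF-8 bytes of the key as the stable hash (MD5 is chosen only because its output is fixed by specification, not for security). The names `bucket_for` and `num_buckets` are hypothetical and do not come from the question. The same steps (encode, hash, take the leading bytes as an integer, reduce modulo the bucket count) should carry over to any platform with an MD5 implementation.

```python
import hashlib
from collections import Counter


def bucket_for(key: str, num_buckets: int) -> int:
    """Map a string key to a bucket index in [0, num_buckets).

    Hypothetical helper: MD5 is defined by specification, so the result
    does not depend on the runtime or its version.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    # Fix the encoding explicitly so every platform hashes the same bytes.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned big-endian integer.
    value = int.from_bytes(digest[:8], "big")
    # Modulo bias is negligible while num_buckets is far smaller than 2**64.
    return value % num_buckets


if __name__ == "__main__":
    # Rough uniformity check over made-up keys.
    counts = Counter(bucket_for(f"user-{i}", 16) for i in range(100_000))
    for bucket, count in sorted(counts.items()):
        print(bucket, count)
```

The bucket index could then be stored as part of a composite partition key, with the caveat from the comment above that Cassandra hashes the partition key again on its own.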


0 Answers