I would like to know if there is a multi-language library, or something similar, that would give me the following result:
- I have a String = "Abcde12345" in Java
- We will suppose its hashcode in Java is "78911"
- I have a String = "Abcde12345" in a C program
What I'd like to know is: how can I easily get the hashcode 78911 in my C program? Since each language can provide its own hash algorithm for a String, how can I handle that?
I'm asking this in the context of using Distributed Hash Tables (datagrids, distributed caches, NoSQL...). I'm planning to create something similar to a very simple client in C for a Java proprietary datagrid.
This is my use case for now, but for my project I will need a hash algorithm compatible with multiple languages:
- Java hash algorithm in Ruby
- C# hash algorithm in Java
- C++ hash algorithm in Java
- Java hash algorithm in C++
- Java hash algorithm in Erlang
In any case, the implementations in both languages will need to produce the exact same hash value.
And if possible, I'd like to extend the concept to primitive types and "simple structures", not just Strings.
Does anyone know of a tool that handles my use case?
Edit: for Jim Balter
My use case is:
I have a proprietary partitioning/datagrid technology called GemFire, written in Java. It acts as a distributed hashmap with a fixed number of buckets. For each map key, it computes the key's hashcode and applies a modulo, so that it knows which bucket each key belongs to.
For example, if I have 113 buckets (which is the default number of buckets in GemFire), and my map key is the String "Key":
"Key".hashCode() % 113 = 69
Thus GemFire knows "Key" belongs to the 69th bucket.
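To make the computation above concrete, here is a minimal sketch of what it would look like in C. It assumes the key is plain ASCII (so each byte matches one UTF-16 code unit, which is what Java's `String.hashCode()` iterates over) and that the bucket is really just `hashCode() % bucketCount` as described in this post. I have not checked how GemFire handles negative hashcodes. The names `java_string_hash` and `bucket_for` are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of Java's String.hashCode(): h = 31*h + c for each character,
 * with 32-bit wraparound.
 * Assumption: the key is plain ASCII, so one byte == one UTF-16 code unit.
 * Non-ASCII keys would first need converting to UTF-16. */
static int32_t java_string_hash(const char *s)
{
    uint32_t h = 0; /* unsigned arithmetic wraps like Java's int overflow */
    int32_t result;

    for (; *s != '\0'; ++s) {
        h = 31u * h + (unsigned char)*s;
    }

    /* Reinterpret the 32 bits as a signed value, as Java would report it. */
    memcpy(&result, &h, sizeof result);
    return result;
}

/* Hypothetical helper: bucket index as described in this post.
 * Like Java's %, C99's % truncates toward zero, so a negative hashcode
 * gives a negative result here. Whether GemFire normalises that is an
 * open assumption. bucket_count must be positive. */
static int32_t bucket_for(const char *key, int32_t bucket_count)
{
    return java_string_hash(key) % bucket_count;
}
```

Under these assumptions, `java_string_hash("Key")` gives 75327 and `bucket_for("Key", 113)` gives 69, which matches the example above.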
Now I have a C application:
- This application is already aware of the number of buckets used by GemFire (113).
- This application needs to be able to compute, for any arbitrary key, the bucket number in which GemFire would put that key.
- This application needs to be able to compute it quickly, so we can't use a web service.
- This application should be easy to deploy, and I don't want any bridge technology between C and Java that would require a JVM to be installed to run the C application.
So if you know how to do that without having to write/use a Java hashcode port in C, please tell me.
Edit: to avoid confusion: I'm not looking for anything else, but Jim Balter, you suggested I do not need what I claim to need, so tell me if you see any other solution, apart from using, as you said, a custom or popular hash algorithm.
And in the future I may need to do the same for an Erlang partitioning application with a C# client application, and other languages!
Edit: I would like to avoid using a non-Java hash algorithm (someone suggested using md5/sha1 or a faster non-security-oriented hash). This is because my solution aims to be deployed on legacy distributed systems, often written in Java, which already contain a lot of data, and any change in the hash algorithm would require a heavy data migration. However, I'm keeping this solution in mind, since it could be a good second option for people starting a new distributed system from scratch or ready to migrate their data.
So in the end, I'm not looking for people to tell me to implement the Java String hash algorithm in C; I already know I can do that, thanks! I want to know if someone has already done it, and not only by implementing all the primitive Java hash algorithms in C, but also in other languages, and from other languages! I'm looking for a multi-language library that provides, for each language, ports of the other languages' hash algorithms.
Thus if there were only 3 languages on Earth (C, Java and Python), my question is: is there any polyglot library that provides:
- A port of Java hash in C
- A port of Java hash in Python
- A port of C hash in Java
- A port of C hash in Python
- A port of Python hash in Java
- A port of Python hash in C
For all available primitive types, and possibly basic structures. If a given language has no "default hash algorithm", then the most widely used one can be considered the language's algorithm.
You see what I mean? I want to know if there is a LIBRARY! I know I can look in the JDK or the specification and implement it on my own, but as I'm targeting a large number of languages and I don't know how to code in every language, I'd like someone to have done it for me and made it available as an open-source, free-to-use project!