56

I want to generate a unique hash code for a string input in Android. Is there a predefined library for this, or do we have to generate it manually? If anybody knows, please provide a link or some code.

stacka
Shashikant
  • 1
    What about built-in hashCode for Strings? – Mikita Belahlazau May 25 '11 at 06:59
  • 3
    unique hash code? why? and how do you even think that's possible? – amal May 25 '11 at 07:01
  • 9
    Please elaborate. Unique hash codes are impossible (unless they can have an infinite length), since there is an infinity of possible strings. – JB Nizet May 25 '11 at 07:03
  • That is totally wrong. I can think of five ways to create a unique hash code just off the top of my head. It all starts with overriding the hashCode () function. Anyway, the last comment is a bit of an oversimplification: If a hash is created on a large enough domain with a robust generator, the chances of a collision can be extremely small. IFF you override hashCode to use a thread-safe incrementor, you can have unique values. Most of the time, though, when implemented correctly, it just *does not matter* probability-wise. – ingyhere May 03 '13 at 22:19
  • @ingyhere - Please show us how ... – Stephen C Jun 09 '15 at 02:45
  • @Stephen C - An earlier comment was surely deleted in the two years before your followup. More importantly, though, context is key: This isn't the Crypto board. You missed the part about "it does not matter probability wise" and in this context unique means 'effectively unique'. But think about it: There is an easy way to check for collisions in a self-contained system. – ingyhere Nov 12 '15 at 21:45
  • 3
    You have no idea what the OP's context is. Absolutely no idea. Assuming that when he says "unique" that he doesn't mean that is a huge stretch. Anyhow, the challenge remains: Show us how. – Stephen C Nov 12 '15 at 22:47

8 Answers

65

It depends on what you mean:

  • As mentioned, String.hashCode() gives you a 32-bit hash code.

  • If you want (say) a 64-bit hashcode you can easily implement it yourself.

  • If you want a cryptographic hash of a String, the Java crypto libraries include implementations of MD5, SHA-1 and so on. You'll typically need to turn the String into a byte array, and then feed that to the hash generator / digest generator. For example, see @Bryan Kemp's answer.

  • If you want a guaranteed unique hash code, you are out of luck. Hashes and hash codes are non-unique.

A Java String of length N has 65536 ^ N possible states, and requires an integer with 16 * N bits to represent all possible values. If you write a hash function that produces an integer with a smaller range (e.g. fewer than 16 * N bits), you will eventually find cases where more than one String hashes to the same integer; i.e. the hash codes cannot be unique. This is called the Pigeonhole Principle, and there is a straightforward mathematical proof. (You can't fight math and win!)
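To make the pigeonhole argument concrete, here is a minimal sketch of two different strings that share the same String.hashCode() value. The pair "Aa" / "BB" is a commonly quoted example; the class name HashCollisionDemo is only for illustration.

```java
// Two distinct strings whose 32-bit String.hashCode() values are equal.
class HashCollisionDemo {
    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println(a.hashCode()); // 2112
        System.out.println(b.hashCode()); // 2112
        System.out.println(a.equals(b));  // false
    }
}
```

So even two-character strings can collide in practice, well before the 32-bit range is exhausted.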

But if "probably unique" with a very small chance of non-uniqueness is acceptable, then crypto hashes are a good answer. The math will tell you how big (i.e. how many bits) the hash has to be to achieve a given (low enough) probability of non-uniqueness.
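The math in question is the birthday bound: for n inputs and a b-bit hash that behaves like a random function, the probability of at least one collision is approximately 1 - exp(-n(n-1) / 2^(b+1)). A rough sketch of that estimate follows; the class and method names are made up for illustration, and the result is only an approximation.

```java
// Birthday-bound estimate of the collision probability for n random b-bit hashes.
class BirthdayBound {
    // Approximates 1 - exp(-n(n-1) / 2^(bits+1)); expm1 keeps precision for tiny results.
    static double collisionProbability(double n, int bits) {
        double space = Math.pow(2.0, bits);
        return -Math.expm1(-n * (n - 1) / (2.0 * space));
    }

    public static void main(String[] args) {
        // About 77,000 strings are enough for a roughly even chance of a 32-bit collision.
        System.out.println(collisionProbability(77163, 32));
        // With a 160-bit hash (e.g. SHA-1), a billion strings give a vanishingly small chance.
        System.out.println(collisionProbability(1e9, 160));
    }
}
```

The estimate assumes the hash output is uniformly distributed, which is a reasonable model for crypto hashes but not for String.hashCode().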

Stephen C
  • 64-bit hashcode: for completeness if you want a 64-bit one, from sfussenegger in http://stackoverflow.com/questions/1660501/what-is-a-good-64bit-hash-function-in-java-for-textual-strings – Antoni Nov 05 '15 at 11:33
  • So, a 32-bit hash can only uniquely identify a String with 2 characters? – ADTC Dec 08 '17 at 22:04
  • Basically ... yes. (Assuming character == arbitrary `char` value. It gets a bit more complicated if character means Unicode codepoint ... or (say) ASCII codepoint.) – Stephen C Dec 09 '17 at 00:09
38

This is a class I use to create Message Digest hashes:

import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha1Hex {

    public String makeSHA1Hash(String input)
            throws NoSuchAlgorithmException, UnsupportedEncodingException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        md.reset();
        byte[] buffer = input.getBytes("UTF-8");
        md.update(buffer);
        byte[] digest = md.digest();

        // Render each byte as two lowercase hex digits.
        StringBuilder hexStr = new StringBuilder(digest.length * 2);
        for (int i = 0; i < digest.length; i++) {
            hexStr.append(Integer.toString((digest[i] & 0xff) + 0x100, 16).substring(1));
        }
        return hexStr.toString();
    }
}
itsadok
Bryan Kemp
9
String input = "some input string";
int hashCode = input.hashCode();
System.out.println("input hash code = " + hashCode);
Boris Pavlović
  • 13
    @Vladimir - by definition no hash code is defined to be unique! Hashcode needs to be well distributed, uniqueness idea is a faulty understanding of the OP. – bestsss May 25 '11 at 07:50
  • 29
    if hash code was unique, that'd be one hell of a compression algorithm. – Jeffrey Blattman Mar 05 '13 at 18:05
  • 9
    e.g. try "Z@S.ME" and "Z@RN.E" they have the same hash values when using hashCode ;) – Simon Mar 18 '13 at 13:32
  • @Simon, just ran your example in .NET because I was curious. They must have different base hashing algorithms because they aren't exact matches there. https://dotnetfiddle.net/6YJRpV – ps2goat Jun 03 '15 at 16:47
  • Maybe what OP meant by unique is: unique-for-a-given-input-string (no two hashes should be generated for the same string). – Sanjay Verma May 23 '19 at 03:38
  • @Simon Oh wow, they collide – TechWisdom Sep 27 '22 at 11:56
4

I use this; I tested it as the key for my EhCacheManager memory map.

It's cleaner, I suppose:

    /**
     * Return the SHA-256 hash of a String value.
     *
     * @param text the string to hash
     * @return the digest as a hex string, or an empty string on failure
     */
    public static String getHash256(String text) {
        try {
            return org.apache.commons.codec.digest.DigestUtils.sha256Hex(text);
        } catch (Exception ex) {
            Logger.getLogger(HashUtil.class.getName()).log(Level.SEVERE, null, ex);
            return "";
        }
    }

I am using Maven, but this is the jar: commons-codec-1.9.jar
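If pulling in commons-codec is not convenient, roughly the same hex digest can be produced with the JDK alone. This is only a sketch under that assumption: the class name Sha256Hex and the choice to rethrow as IllegalStateException are mine, not part of the answer above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// JDK-only SHA-256 hex digest of a string (hypothetical helper class).
class Sha256Hex {
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes format as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Not expected in practice: SHA-256 is a standard MessageDigest algorithm.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc"));
    }
}
```

Unlike the version above, this sketch fails loudly instead of returning an empty string, so a failed hash cannot silently become a cache key.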

shareef
3

You can use this code to generate a hash code for a given string.

String str = "some input string";
int hash = 7;
for (int i = 0; i < str.length(); i++) {
    hash = hash * 31 + str.charAt(i);
}
Manmohan Soni
3

For me it worked

public static long getUniqueLongFromString(String value) {
    return UUID.nameUUIDFromBytes(value.getBytes()).getMostSignificantBits();
}
Raluca Lucaci
2

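A few lines of Java code.

import java.math.BigInteger;
import java.security.MessageDigest;

public static void main(String[] args) throws Exception {
    String str = "test string";
    MessageDigest messageDigest = MessageDigest.getInstance("MD5");
    // Hash the UTF-8 bytes; the byte count can differ from str.length().
    messageDigest.update(str.getBytes("UTF-8"));
    // %032x keeps the leading zeros that BigInteger.toString(16) would drop.
    System.out.println("MD5: " + String.format("%032x", new BigInteger(1, messageDigest.digest())));
}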
Durgpal Singh
0

Let's take a look at the stock hashCode() method:

public int hashCode() {
    int h = hash;
    if (h == 0 && count > 0) {
        for (int i = 0; i < count; i++) {
            h = 31 * h + charAt(i);
        }
        hash = h;
    }
    return h;
}

The block of code above comes from the java.lang.String class. As you can see, it is a 32-bit hash code, which is fair enough if you are using it on a small amount of data. If you are looking for a hash code with more than 32 bits, you might want to check out this link: http://www.javamex.com/tutorials/collections/strong_hash_code_implementation.shtml
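As one possible way to get more than 32 bits, here is a minimal sketch of a 64-bit string hash. Note that this is FNV-1a applied to the UTF-8 bytes, swapped in for illustration; it is not the method from the linked page, and the class name Fnv1a64 is made up.

```java
import java.nio.charset.StandardCharsets;

// 64-bit FNV-1a over the UTF-8 bytes of a string (illustrative, non-cryptographic).
class Fnv1a64 {
    private static final long OFFSET_BASIS = 0xcbf29ce484222325L;
    private static final long PRIME = 0x100000001b3L;

    static long hash(String s) {
        long h = OFFSET_BASIS;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff); // mix in the next byte as an unsigned value
            h *= PRIME;      // multiplication wraps modulo 2^64, as FNV expects
        }
        return h;
    }

    public static void main(String[] args) {
        // "Aa" and "BB" collide under String.hashCode() but not under this hash.
        System.out.println(Long.toHexString(hash("Aa")));
        System.out.println(Long.toHexString(hash("BB")));
    }
}
```

A 64-bit hash still cannot be unique for arbitrary strings; it only makes accidental collisions much less likely than with 32 bits.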

ngaspama