9

Is there any way to take the hash code of a string in Java and recreate the original string from it?

e.g. something like this:

String myNewstring = StringUtils.createFromHashCode("Hello World".hashCode());
if (!myNewstring.equals("Hello World"))
    System.out.println("Hmm, something went wrong: " + myNewstring);

I ask because I must turn a string into an integer value and then reconstruct the string from that integer value.

Richard J. Ross III

6 Answers

6

This is impossible. The hash code for String is lossy; many String values will result in the same hash code. A Java int has 32 bit positions and each position has two values, so there are only 2^32 distinct hash codes. There's no way to map even just the 32-character strings (each character being a 16-bit char with 65,536 possible values) into 32 bits without collisions. They just won't fit.

If you want to use arbitrary precision arithmetic (say, BigInteger), then you can just take each character as an integer and concatenate them all together. Voilà.
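A minimal sketch of that BigInteger idea, under some assumptions of mine: the class and method names (`StringAsBigInteger`, `encode`, `decode`) are hypothetical, the characters are concatenated as UTF-8 bytes, and a leading 0x01 sentinel byte is added so that leading zero bytes are not lost in the numeric form. The resulting number grows with the string, so it will not fit in an int.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a library class: a lossless String <-> BigInteger mapping.
class StringAsBigInteger {

    // Concatenate the string's UTF-8 bytes into one big number.
    // The 0x01 sentinel in front keeps leading zero bytes from being dropped.
    static BigInteger encode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        byte[] withSentinel = new byte[utf8.length + 1];
        withSentinel[0] = 1;
        System.arraycopy(utf8, 0, withSentinel, 1, utf8.length);
        return new BigInteger(1, withSentinel); // signum 1 = positive magnitude
    }

    // Reverse the mapping: skip the sentinel byte and decode the rest as UTF-8.
    static String decode(BigInteger n) {
        byte[] bytes = n.toByteArray();
        return new String(bytes, 1, bytes.length - 1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        BigInteger n = encode("Hello World");
        System.out.println(n);         // one large decimal number
        System.out.println(decode(n)); // Hello World
    }
}
```

Unlike a hash code, this round trip is exact, at the cost of roughly eight bits of number per byte of text.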

Ted Hopp
  • Then is there another method I could use? Some type of encoding? – Richard J. Ross III Jun 13 '11 at 18:33
  • @Richard, Base 10 is the encoding you are using. – Vineet Reynolds Jun 13 '11 at 18:35
  • @Richard What you need is encryption/decryption. – Marcelo Jun 13 '11 at 18:35
  • @Marcelo, any suggestions on what encryption to use? – Richard J. Ross III Jun 13 '11 at 18:36
  • what he needs is data compression/decompression without data loss. – Hovercraft Full Of Eels Jun 13 '11 at 18:38
  • @Richard I found an example here: http://www.java-tips.org/java-se-tips/javax.crypto/encryption-and-decryption-using-symmetric.html – Marcelo Jun 13 '11 at 18:38
  • @Marcelo, that just gives me a byte array. If I wanted a simple byte array I would convert it directly to that, but I need a single integer... – Richard J. Ross III Jun 13 '11 at 18:42
  • 3
    I can't believe how senseless this idea of using encryption is. How on earth can anyone recover a possibly infinitely long message from a cipher text that is bounded to 32-bits? – Vineet Reynolds Jun 13 '11 at 18:44
  • Yes, n bytes to 4 bytes and back is impossible. Encrypt to a string http://www.exampledepot.com/egs/javax.crypto/DesString.html or compress http://java.sun.com/developer/technicalArticles/Programming/compression/ – Marcelo Jun 13 '11 at 18:46
  • @Marcelo, you can throw out as many links as you like. But none of them will allow you to **accurately map a single 32-bit integer** to a sequence of characters of unlimited length. – Vineet Reynolds Jun 13 '11 at 18:49
  • @Marcelo, if I have to pick holes in your theory, which compression scheme will allow an encrypted message to be mapped to **exactly 32 bits** so that on decryption the original message can be obtained? – Vineet Reynolds Jun 13 '11 at 18:53
  • @Vineet I **know** you can't compress n bytes to 4 and back, I am just trying to give @Richard some alternatives. If I thought this was an **answer**, I would have posted it as one. – Marcelo Jun 13 '11 at 19:41
4

No. Multiple Strings can have the same hash code. In theory you could generate all the Strings that have that hash code, but because strings can be arbitrarily long, the list would be practically endless.
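A short illustration of such a collision, using the well-known pair "Aa" and "BB" (the class name `HashCollisionDemo` is just a label for this sketch):

```java
// Two different strings with the same String.hashCode() value.
class HashCollisionDemo {
    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        System.out.println("Aa".hashCode()); // 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println("BB".hashCode()); // 2112
        // Colliding blocks of equal length can be chained into longer collisions.
        System.out.println("AaAa".hashCode() == "BBBB".hashCode()); // true
    }
}
```

Given only the value 2112, there is no way to tell which of the two strings produced it.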

JustinKSU
2

Impossible, I'm afraid. Think about it: a hash code is an int value, i.e. 4 bytes. A string may be shorter than this but could also be much longer, and you cannot squeeze a longer string into 4 bytes without losing something.

Java's String.hashCode() folds every character into a single 32-bit total, s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], so for anything but the shortest strings most of the information is thrown away. If your strings are all very short then you could encode them as an int or a long without losing anything.
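For the very-short-string case, here is a rough sketch of a lossless encoding into a long. The names (`ShortStringPacker`, `pack`, `unpack`) are hypothetical, and it assumes at most 7 characters, each with a code point of 255 or less, so that every character fits in one byte and a sentinel bit can mark where the data starts.

```java
// Hypothetical helper: packs up to 7 eight-bit characters into one long, losslessly.
class ShortStringPacker {
    static final int MAX_CHARS = 7; // 7 data bytes + 1 sentinel byte = 64 bits

    static long pack(String s) {
        if (s.length() > MAX_CHARS) {
            throw new IllegalArgumentException("at most " + MAX_CHARS + " characters fit");
        }
        long packed = 1L; // sentinel: marks the start of the data, preserving length
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0xFF) {
                throw new IllegalArgumentException("only characters up to 0xFF are supported");
            }
            packed = (packed << 8) | c; // append one byte per character
        }
        return packed;
    }

    static String unpack(long packed) {
        StringBuilder sb = new StringBuilder();
        while (packed > 1) {                   // stop when only the sentinel is left
            sb.append((char) (packed & 0xFF)); // lowest byte is the last character
            packed >>>= 8;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        long n = pack("Hello");
        System.out.println(unpack(n)); // Hello
    }
}
```

An int leaves room for only 3 such characters with the same scheme, which is why "Hello World" cannot be stored this way.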

Andrew
1

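For example, "1019744689" and "123926772" both have a hashcode of -1727003481. This shows that hashCode is not one-to-one: given only the integer, a reverse function cannot know which string it came from, so it will return the wrong result for at least one of them (i.e. reversehashcode(hashcode(string)) != string).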

hyper-neutrino
0

Let's assume the string consists only of letters, digits and punctuation, so there are about 70 possible characters.

log_70(2^32) = 5.22...

This means for any given integer you will find a 5- or 6-character string with this as its hash code. So, retrieving "Hello World": impossible; but "Hello" might work if you're lucky.

Cephalopod
  • Apparently, with very few exceptions, all English words map to a distinct number under Java's hashCode operation. So if you have the hash of a single word, you can step through a dictionary checking the hash codes, and have a very high chance of recovering the original word. – Jules Dec 11 '13 at 16:02
  • A very interesting find indeed. Alas, the question implies that an arbitrary string, or at least a sequence of multiple words, is to be stored. – Cephalopod Dec 12 '13 at 14:01
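The dictionary approach from the comment above could be sketched roughly as follows. The class and method names are hypothetical, and the word list here is a tiny inline stand-in for a real dictionary file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: finds dictionary words whose hashCode matches a given value.
class HashDictionaryLookup {

    // Step through the dictionary and collect every word with the target hash code.
    // More than one match means the hash alone cannot identify the original word.
    static List<String> candidates(int hash, List<String> dictionary) {
        List<String> matches = new ArrayList<>();
        for (String word : dictionary) {
            if (word.hashCode() == hash) {
                matches.add(word);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Placeholder word list; a real run would load a full dictionary.
        List<String> dictionary = Arrays.asList("Hello", "World", "Aa", "BB");
        System.out.println(candidates("Hello".hashCode(), dictionary)); // [Hello]
        System.out.println(candidates("Aa".hashCode(), dictionary));    // [Aa, BB]
    }
}
```

This only recovers strings that are already in the list, so it does not help with arbitrary input.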
0

You could do something like this:

char[] chars = "String here".toCharArray();
int[] ints = new int[chars.length];
for (int i = 0; i < chars.length; i++) {
    ints[i] = (int)chars[i];
}

Then:

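// Rebuild the characters from the stored int values.
char[] restored = new char[ints.length];
for (int i = 0; i < restored.length; i++) {
    restored[i] = (char) ints[i];
}
// "final" is a reserved word in Java, so the variable needs another name.
String result = new String(restored);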

I have not actually tested this yet... It is just "concept" code.