0

I am looking for a hashing function that can be used for non cryptographic purposes in Java. The challenge is that most of the hashing functions return signed integer values (-,0,+) that cannot be used as identifier in every context (for example negative integers cannot be used in URLs). One solution to this problem is that I come up with is to use a 32 bit signed int and convert it to a 32 bit unsigned int and store it in a long. This works pretty well. However, 32 bit random information makes hash collisions too frequent in our setup. One way of solving this is to use a 64 bit hashing function (same SipHash works fine) and convert that signed integer to unsigned by shifting one to the right and having 0 in the MSB position. I was trying to achieve that with the Java >> operator but the results does not make sense.

//Using Guava
private final static HashFunction hashFunction = Hashing.sipHash24();

    private static int getRandomInt() {
        return hashFunction.newHasher().putLong(rnd.nextLong()).hash().asInt();
    }

    private static long getRandomLong(){
        return hashFunction.newHasher().putLong(rnd.nextLong()).hash().asLong();
    }

Bitshifting:

 System.out.println(Long.toBinaryString(-2147483648L >> 1));
 1111111111111111111111111111111111000000000000000000000000000000

What am I missing and how could I have a 62 bit unsigned integer hash value stored in a 64 bit int (long) in Java?

UPDATE1:

After doing some research I finally found a way to correctly display the effect of >>> on a Long value:

        System.out.println(
          String.format("%64s", Long.toBinaryString(-2147483648L))
            .replace(' ', '0'));
        System.out.println(
          String.format("%64s", Long.toBinaryString(-2147483648L >>> 1))
            .replace(' ', '0'));

        1111111111111111111111111111111110000000000000000000000000000000
        0111111111111111111111111111111111000000000000000000000000000000
Istvan
  • 7,500
  • 9
  • 59
  • 109
  • 1
    Why can't you use a negative value in a url? Also, try [`>>>`](http://stackoverflow.com/questions/2811319/difference-between-and) (unsigned right shift) instead of `>>` (signed right shift). – Elliott Frisch Mar 28 '17 at 21:56
  • @ElliottFrisch thanks! because /-123123/ looks awfully similar to /123123/ and there are routing based on this value that does not support it being negative. – Istvan Mar 28 '17 at 22:10

1 Answers1

2

a >> b

shifts a to the right by b bits. On the left it repeats the bit that was already there (sign extending!). Examples:

  • 101010 >> 1 = 110101
  • 010101 >> 1 = 001010

a >>> b

also shifts a to the right by b bits, but doesn't sign extend. It always adds in zeros on the left:

  • 101010 >>> 1 = 010101
  • 010101 >>> 1 = 001010
Vast Dan
  • 74
  • 4