0

I need to come up with an algorithm for a hash function that take 20 bits long input and the output should be only a 5 bits long.

I have searched in the internet for the last week and I couldn't find anything helpful.

your help is very much appreciated. Thanks

Joe
  • 41,484
  • 20
  • 104
  • 125
John Stufy
  • 25
  • 5
  • 3
    Take the input stream 4 bits at a time. XOR each bit and output the result. – Dave Rager May 07 '13 at 17:08
  • 4
    Do you need *any* hash algorithm, or a *good* one? 5 bits is small, you will have dupes. You could do all kinds of things: take the first/last 5 bits. DO some computation on the bits. Always return "1". There are a number of existing hash functions you can use that would probably be better than those choices though. :) – Joe May 07 '13 at 17:09
  • depends heavily on how your input is distributed, but two obvious choices are 4-fold xor and mod-31 – Chris Dodd May 07 '13 at 17:09
  • Easy answer: copy the first 5 bits and discard the next 15 bits – Thomash May 07 '13 at 19:23

4 Answers4

2

A simple solution is to break the input into 4 blocks of 5 bits and XOR them.

Barmar
  • 741,623
  • 53
  • 500
  • 612
0

This calculation is equivalent to “break the input into 4 blocks of 5 bits and XOR them” as suggested by Barmar, but probably is a tad more efficient (where x is the input):

t = x ^ (x>>10);
result = (t ^ (t>>5)) & 31;

However, the XOR approach generally won't stir and mix the original 20 bits as much as does the equation mentioned by Crocker. On some machines, this approach will be faster than Crocker's, and vice versa on others.

James Waldby - jwpat7
  • 8,593
  • 2
  • 22
  • 37
0
((x * 1772 + 271828182) % 314159) & 31
Egor Skriptunoff
  • 23,359
  • 2
  • 34
  • 64
-1

Do you really need an :edit: original or new algorithm? (ie. this is a school assignment).

You could just use the standard library's random number generator and seed it your 20 bits. The algorithm it uses will be more than sufficient. What common algorithms are used for C's rand()?

Just pick one of the standard implementations and you should be fine. Below example is just living on the wild (non-portable side). Since the rand implementations are different depending on the compiler or the version of the library your using the hashes won't match up if your using them for something universal, but my bet is you just need an at runtime message checksum or it's a school assignment.

#include <time.h>
#include <stdlib.h>
...
int compress20To5(int input) {
    srand( input );
    return rand() & 0x1f; // mask last 5 bits 0x1f (11111b)
}
Community
  • 1
  • 1
Louis Ricci
  • 20,804
  • 5
  • 48
  • 62
  • well actually it does't mattter.The thing is i will use it in FPGA so i will be using VHDL so the C library is not very helpful :( – John Stufy May 07 '13 at 19:48
  • Well if you are doing this in hardware the XOR solution looks ideal. – Peter Webb May 08 '13 at 09:57
  • @Peter Webb - the straight XOR solution is far from ideal, it creates some very obvious collisions due to the symmetric property of XOR. 0xFF XOR 0x00 == 0x00 XOR 0xFF – Louis Ricci May 08 '13 at 12:27
  • @John Stufy - If you follow the link in my answer you'll see the related SO question that has the actual implementation (source code) of those referenced C library implementations, you can pick (copy & paste) one of those implementations into your own code. – Louis Ricci May 08 '13 at 12:44