1

What I need

I need an algorithm that produces a bijective output. I have a 31-bit input and need a pseudo-random 31-bit output.

What I have considered

CRCs are bijective within their bit-width.

I have looked on Google and can find the polynomials for this, but not the tables or algorithm.

Could anyone point me in the right direction?

I need a CRC-31 algorithm using polynomial say 0x737e312b, or any bijective function that will do what I need.

NOTE

I found the following code, but I unfortunately don't have the tools to compile and run it.

IamIC
  • 17,747
  • 20
  • 91
  • 154
  • Since the output is the same range as the input, you could use the input as the output – David Zimmerman May 02 '19 at 18:27
  • Do you mean literally output = input? That isn't what I want. I'm looking for a pseudo-randomization of the input. – IamIC May 02 '19 at 18:36
  • 1
    So you want to perform a random permutation of the input? – user58697 May 02 '19 at 18:54
  • There are 32 and 64 bit variants that are faster than CRC listed [here](https://stackoverflow.com/questions/664014/what-integer-hash-function-are-good-that-accepts-an-integer-hash-key/12996028#12996028). It should be possible to build a 31 bit variant from this. – Thomas Mueller May 02 '19 at 18:57
  • I want to perform a bijective permutation of the input. 31-bit bit input -> bijective hash (31-bit) -> 31-bit output. – IamIC May 02 '19 at 18:57
  • I assume with "bijective" you mean you also need a way to "undo" the hash operation. Actually I wouldn't know how to do that with a CRC. – Thomas Mueller May 02 '19 at 18:59
  • I don't need to undo it. I just need a 1:1 mapping where the output "appears somewhat random" as regards the input. – IamIC May 02 '19 at 19:00
  • But you require that there are no collisions? (Meaning, no input may produce the same output as another input.) – Thomas Mueller May 02 '19 at 19:05
  • That is correct. Bijective, no collisions. CRC is bijective over its bit width. – IamIC May 02 '19 at 19:07
  • `I don't need to undo it.` Given a bijection, you won't be able to help to be able to. – greybeard May 02 '19 at 20:31
  • Undoing is not the point. Getting it done is the point :-) – IamIC May 02 '19 at 20:33
  • Whoever voted to close this... may I suggest you read the help center notes on what is on-topic. – IamIC May 02 '19 at 20:35
  • 1
    You can use any bijective function. Are you asking what bijective functions exist, or which one is better, or how to implement one, or what? – n. m. could be an AI May 02 '19 at 20:47
  • I haven't found much by way of bijective algorithms other than CRC. If there is a workable algorithm that can do what I want, that would be great :-) – IamIC May 02 '19 at 20:53
  • 3
    You could just XOR the input with any random 31-bit number. – Kyle Willmon May 02 '19 at 21:04
  • True, but an input of say {1, 2, 3} would result in an output where only the 2 least significant bits change. CRC will produce a somewhat random looking difference between these 3 numbers (e.g. 2583214201, 2337085335, 871461106}. – IamIC May 02 '19 at 21:09
  • 1
    Or, you could multiply by an (odd) prime mod 2**31 – wildplasser May 02 '19 at 21:23
  • I was *just* playing with that idea. Is that guaranteed to be bijective? Would the prime need to be within or without the range? – IamIC May 02 '19 at 21:26
  • 1
    Well, try it! (just takes 15 minutes or so) – wildplasser May 02 '19 at 21:30
  • 1
    The multiply-and-mod-*n* approach doesn't even require a prime; any odd number should do. And this can be combined with XOR-ing for a better semblance of randomness of successive integers. (You can even do a few alternating rounds of both.) – ruakh May 03 '19 at 01:15

1 Answers1

3

For any hash function hash, you can do:

function bijectiveHash31(int val) {
    val &= 0x7FFFFFFF; //make sure it's 31 bits
    for (int i=0; i<5; ++i) {
        // the high bits affect the low bits
        val ^= hash(val>>15) & 32767;
        // rotate bits
        val = ((val&32767)<<16) | ((val>>15)&65535);
    }
    return val;
}

This is a Feistel structure, which forms the basis of many ciphers: https://en.wikipedia.org/wiki/Feistel_cipher

If you need it to be fast and you don't need it to be super good, then this works fine:

function bijectiveHash31(int val) {
    val = ((val*RANDOM_ODD_NUMBER) + RANDOM_NUMBER) & 0x7FFFFFFF;
    val ^= (val>>15);
    val ^= (val>>8);
    return val;
}

In both of these cases, it's not too difficult to figure out how you could undo each elementary operation, which shows that the whole hash is bijective. If you need help establishing that for the multiplication, see https://en.wikipedia.org/wiki/Modular_multiplicative_inverse

Matt Timmermans
  • 53,709
  • 3
  • 46
  • 87
  • 1
    An alternative way to ensure it is bijective (doesn't have collisions), you could loop over all possible inputs and for each output set a bit in a 256 MiB bit field. At the end, check that all bits are set. With a decently fast programming language (C, Java,...), this only takes a few seconds. – Thomas Mueller May 03 '19 at 06:21
  • Using the second algorithm as the `hash` function for the first algorithm yields particularly impressive results. – IamIC May 03 '19 at 22:26
  • 1
    FYI I implemented 64, 48, 44, 32, 16 bit version [here](https://github.com/thomasmueller/minperf/blob/master/src/main/java/org/minperf/utils/RandomSetGenerator.java#L170). But not 31 bit. – Thomas Mueller May 09 '19 at 13:38