13

I'm working on a system that generates random ids like in answer #2 here.

My problem is, that the mentioned pseudo_encrypt() function works with int not bigint. I tried to rewrite it but it returns always the same result:

CREATE OR REPLACE FUNCTION pseudo_encrypt(VALUE bigint) returns bigint AS $$
DECLARE
l1 bigint;
l2 int;
r1 bigint;
r2 int;
i int:=0;
BEGIN
    l1:= (VALUE >> 32) & 4294967296::bigint;
    r1:= VALUE & 4294967296;
    WHILE i < 3 LOOP
        l2 := r1;
        r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767)::int;
        l1 := l2;
        r1 := r2;
        i := i + 1;
    END LOOP;
RETURN ((l1::bigint << 32) + r1);
END;
$$ LANGUAGE plpgsql strict immutable;

Can somebody check this?

Community
  • 1
  • 1
snøreven
  • 1,904
  • 2
  • 19
  • 39

2 Answers2

18

4294967295 must be used as the bitmask to select 32 bits (instead of 4294967296). That's the reason why currently you get the same value for different inputs.

I'd also suggest using bigint for the types of l2 and r2, they shouldn't really differ from r1 and l1

And, for better randomness, use a much higher multiplier in the PRNG function to get intermediate block that really occupy 32 bits, like 32767*32767 instead of 32767.

The complete modified version:

CREATE OR REPLACE FUNCTION pseudo_encrypt(VALUE bigint) returns bigint AS $$
DECLARE
l1 bigint;
l2 bigint;
r1 bigint;
r2 bigint;
i int:=0;
BEGIN
    l1:= (VALUE >> 32) & 4294967295::bigint;
    r1:= VALUE & 4294967295;
    WHILE i < 3 LOOP
        l2 := r1;
        r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767*32767)::int;
        l1 := l2;
        r1 := r2;
        i := i + 1;
    END LOOP;
RETURN ((l1::bigint << 32) + r1);
END;
$$ LANGUAGE plpgsql strict immutable;

First results:

select x,pseudo_encrypt(x::bigint) from generate_series (1, 10) as x;
 x  |   pseudo_encrypt    
----+---------------------
  1 | 3898573529235304961
  2 | 2034171750778085465
  3 |  169769968641019729
  4 | 2925594765163772086
  5 | 1061193016228543981
  6 | 3808195743949274374
  7 | 1943793931158625313
  8 |   88214277952430814
  9 | 2835217030863818694
 10 |  970815170807835400
(10 rows)
Daniel Vérité
  • 58,074
  • 15
  • 129
  • 156
  • Ah the bitmask was a stupid mistake... Thank you very much! Works perfectly now! Btw: you wanted to suggest bigint, not int. – snøreven Oct 06 '12 at 21:36
  • @Daniel Vérité What i have to modify in the fuction if i need a bigint with a max length of 13 digits? – MattC Nov 18 '15 at 09:56
  • @MattC: it's non-trivial, you may submit this as a new question. Also with this technique the upper bound is going to be `2^N` where `N` is an even number, not `10^N`. – Daniel Vérité Nov 18 '15 at 15:49
  • @DanielVérité Thank you for your answer, i've posted a new question here http://stackoverflow.com/questions/33760630/generate-unique-random-numbers-in-postgresql-with-fixed-length – MattC Nov 18 '15 at 15:54
  • I noticed the problem about the own inverse function (true for the int version of the pseudo_encrypt). This version up here does not correctly inverse. The answer below by Kurt Riede does this correctly. To try it out use: ```select x,pseudo_encrypt(x::bigint), pseudo_encrypt(pseudo_encrypt(x::bigint)) from generate_series(-10,10) as x;``` – Patrick Boos Sep 28 '16 at 16:32
  • 1
    @PatrickBoos: true, but the self inverse property is not necessarily desirable. For instance, it precludes the use of the cycle walking method, such as explained in https://en.wikipedia.org/wiki/Format-preserving_encryption#FPE_from_a_Feistel_network – Daniel Vérité Sep 28 '16 at 17:56
  • @DanielVérité good point. It might actually be desired. I agree there :) – Patrick Boos Sep 29 '16 at 11:42
  • 1
    @DanielVérité In the original they had `RETURN ((r1 << 16) + l1);`, but you flipped r1 and l1 there. What is the reason for this? – thnee May 06 '20 at 11:56
  • 2
    @thnee: the real original [as first posted](https://www.postgresql.org/message-id/448163db-cac5-4e99-8c4c-57cbc6f6af78@mm) had `return ((l1::bigint<<16) + r1)`. Swapping r1 and l1 make the function a self-inverse, which may or may not be desirable. Please see the other comments above. – Daniel Vérité May 06 '20 at 12:16
9

Old but still an interesting question. Comparing to Daniels answer I am using a slightly modified version, changing the return statement to this (exchanged r1 and l1), as also mentioned at the end of the article Pseudo encrypt:

RETURN ((r1::bigint << 32) + l1);

The reason for this change is that the underlying Feistel algorithm should not exchange left an right at the end of the last round. With this change the function regains the ability to act as its own inverse function:

pseudo_encrypt(pseudo_encrypt(x) == x // always returns true

Here is the full code in pgsql:

CREATE OR REPLACE FUNCTION pseudo_encrypt(VALUE bigint) returns bigint AS $$
DECLARE
l1 bigint;
l2 bigint;
r1 bigint;
r2 bigint;
i int:=0;
BEGIN
    l1:= (VALUE >> 32) & 4294967295::bigint;
    r1:= VALUE & 4294967295;
    WHILE i < 3 LOOP
        l2 := r1;
        r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767*32767)::int;
        l1 := l2;
        r1 := r2;
    i := i + 1;
    END LOOP;
RETURN ((r1::bigint << 32) + l1);
END;
$$ LANGUAGE plpgsql strict immutable;
  • 2
    Thanks so much for this. This answer produces positive bigints as apposed to the accepted answer which seems to go negative for input values >2^32-1 – halfdan Oct 25 '18 at 10:20