3

I am looking for a function that accepts 1 number from an interval 0,1....N and returns a permuted value from the same interval.

An example for 0, 1, 2, 3, 4, 5 and f(x) would be:

f(0)=5;
f(1)=1;
f(2)=0;
f(3)=4;
f(4)=2;
f(5)=3;

From what I researched/understood this is a cyclic group where f(x) is what its all based upon. If have found the function f(x) = 911 * x % N to be an example of what I want, however, patterns come up when using this. 911 is a large prime number and by changing that, I get another permutations but still, patters arise. My wish is for the results to be random-ish.

I know I can pregenerate the permutation using something like [ 0, 1, 2...N ].shuffle() or something but in my case I cannot do such a thing.

I have a service which takes for input a numeric value and returns the permuted value/position. N is set serverside, so is the function I am looking for.

Discipol
  • 3,137
  • 4
  • 22
  • 41
  • You want to have an array with random order of X number of values? – bestprogrammerintheworld May 07 '13 at 10:00
  • i'm probably dumb, but i'm unsure about what do you mean by "permutation" in here. usually, permutation means rearranging given items. like, a permutation of (2,3,5) could be (5,3,2). you feed your function one number and it returns one number. what exactly do you want to "permute"? – user151496 May 07 '13 at 10:04
  • Function accepts 1 number =x, math happens, 1 number is returned which is a random position AND UNIQUE within the same interval where x comes from. – Discipol May 07 '13 at 10:04
  • @user1916182, see my example in the question. – Discipol May 07 '13 at 10:05
  • to make it clear- so you simply want a bijection from a group of numbers into another group of numbers, where their positions in each group differ? – user151496 May 07 '13 at 10:06
  • @user1916182 I am counting on you :D – Discipol May 07 '13 at 12:08
  • i don't think that finding the kind of function is such an easy task. first of all, finding a good evenly distributed random generator "without a pattern" is a problem itself. then you want the resulting numbers to be in a very specific range. also it must be a bijection. you perhaps really should look into math formus for this. what are you trying to achieve anyways? why can't you pre-shuffle the numbers based on the number range but must do it on the fly? – user151496 May 07 '13 at 13:09
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/29541/discussion-between-discipol-and-user1916182) – Discipol May 07 '13 at 13:42
  • i don't think i'm pro-math enough to know this function. otherwise i would have told you right away :) by the way the chat says "down for maintenance" – user151496 May 07 '13 at 14:12
  • Maybe it's because it's 1am, but I still don't understand the limitation about "preshuffling". If the flow is "user sends n to server, server replies with 0-n in random order" why does it matter how the server generates these numbers? – Adrian Frühwirth May 07 '13 at 23:23
  • Is this a "I don't want to store stuff in memory" vs. (how I read it) "it is technically impossible" constraint? – Adrian Frühwirth May 07 '13 at 23:42
  • The formula(along with N) will stay in a database(several versions) and be eval()-ed for each request. I am not gonna put the preshuffled array in there too, its thousands of numbers, returned at each query? No :P – Discipol May 08 '13 at 07:27

3 Answers3

4

Please keep in mind, that the algorithm I describe here is based on the list [1, 2, ... N-1] (with length N-1). If you insist using the list [0, 1, ..., N] (with length N+1), please apply the needed minor modifications. Moreover, for brevity, I am using the % operand in a slightly different manner than most programming languages do: a % b takes values between 1 and b, not between 0 and b - 1, but of course the main idea behind is not changed, so the value of a % b is the integer between 1 and b, which is congruent with a, modulo b.

If you read this through, it will be obvious for you, that the shuffle generated is not random at all. However, with well chosen parameters, the pattern won't be easy to recognize, (my basic idea of modular exponentiation comes from cryptography, where it is important to have non-recognizable patterns and non-revertable functions).

And this is much more the language-agnostic description of the algorithm, than an actual programming solution. I do not go into details of effective implementations and pitfalls you may come across. I hope it still helps. I also coded some parts of this in python, so I can give further help and even share my code if it is needed, but that would need some completing and refactoring before.

Using exponentiation instead of multiplication to get rid of patterns

Your initial trial of f(x) = t * x % N (where you chose t to be 911) shows some patterns because multiplication holds linearity (in the 'modular' meaning of it).

You can give much more random feel if you use exponentiation instead of multiplication. Something like f(x) = t ^ x % N. However, t has to be chosen wisely (as it was in the multiplication case, to be a coprime to N), and the output given by this formula won't provide distinct numbers for distinct x values only in the case of N is prime.

University level math is coming, bear with me, I will try to keep it simple.

We will need to use primitive roots. The related Wikipedia article may help a lot, but the basic idea is that the remainders of powers of a well chosen base take all the values between 1 and N-1. For example

3^1 = 3
3^2 = 9 = 2 (mod 7)
3^3 = 27 = 6 (mod 7)
3^4 = 81 = 4 (mod 7)
3^5 = 243 = 5 (mod 7)
3^6 = 729 = 1 (mod 7)

are all different (from this point, the values are repeating from the beginning: 3^7 = 3^1 (mod 7), 3^8 = 3^2 (mod 7), and so on).

So, if your N is 7, then 3 will work fine to be t. You can use f(x) = (3 ^ x) % 7 for x values between 1 and 6.

This results in the following f:

f(1) = 3
f(2) = 2
f(3) = 6
f(4) = 4
f(5) = 5
f(6) = 1

Introducing a shift, providing some additional random effect

If you are playing with this a little bit, you can notice, that N-1 is always shuffled to 1. If you want to avoid this, we can go a step further, and choose an arbitrary number k to add after the exponentiation. That is, using g(x) = (f(x) + k) % (N-1) = ((t ^ x) % N + k) % (N-1), in our example let k be 2, resulting in the permutation:

g(1) = 5
g(2) = 4
g(3) = 2
g(4) = 6
g(5) = 1
g(6) = 3

How to choose the base

Now you get the basic feel. But how to use this generally, when N is not 7?

The key to the problem is choosing the parameter t, which was 3 in our example.

Unfortunately this is generally a hard question (mathematician's call it finding a primitive root), and there isn't any easy-to-interpret, out-of-the-box solution I am aware of.

But this is only one part of the problem. Even more sadly, a primitive root won't even work if N is a composite number. For example, if N=6, there isn't any number t for which the expression t^x modulo 6 takes all the values between 1 and 5.

But it is not too hard to solve this part.

What to do with a composite N

If N is composite, we should find a prime P, which is barely larger, and build upon the algorithm based on that one, by replacing out-of-bound numbers with their after-the-shuffle values (and iterate, if needed).

For example, if N is 6, we can choose P to be 7 and use our previously constructed g(x).

g(1) = 5 ok (5<=N-1 holds)                          h(1) = 5
g(2) = 4 ok                                         h(2) = 4
g(3) = 2 ok                                    =>   h(3) = 2
g(4) = 6 too large, using g(g(4)) = g(6) = 3        h(4) = 3
g(5) = 1 ok                                         h(5) = 1

Just to be on the safe side, I give an other example with N=4, where we use our previously calculated solution for P=7.

g(1) = 5, g(5) = 1                                  h(1) = 1
g(2) = 4, g(4) = 6, g(6) = 3                   =>   h(2) = 3
g(3) = 2                                            h(3) = 2

This should be clear now. It is wise to choose P close to N, so there aren't too many recalculations needed for out-of-bound values of g.

Back to finding t

So our only problem left is to find the primitive root which can be used as the base of the exponentiation.

If the maths on the pages I linked before cause some visceral disgust, I have some good news for you: possible good values of t are dense in the interval [2, N-1], so even random guessing may help.

There are some details how to verify effectively if a randomly chosen t is really good for us on the linked pages, but unless you are working with really huge numbers, you can just do the exponentiations and check if the number 1 appears earlier than the (N-1)th power of t (maybe you remember I noted that t^x=1 (mod N) holds always in the case of x=N-1, so an earlier appearance of 1 would break uniqueness).

I would recommend to look for a suitable t around N/2 (meaning the order of magnitude - for P=91367, t=54949 works perfectly). If you choose t to be too low (for example t=2), you can easily spot the pattern caused by exponentiation on some neighbouring x values (2+k, 4+k, 8+k, ... will appear next to each other). If t is too close to N, a similar phenomena may appear if you look at f(x) in consecutive x values of the same parity. A good choice of t should cover these patterns, and end with a result randomish enough.

Summary

So once again, here are the steps of the algorithm

(N given)

find a P prime slightly larger than N

choose an arbitrary number k between 1 and P-1

find t which is a primitive root for P

(for a given x the output shuffle h(x) is)

calculate

f(x) = (t ^ x) % P

calculate

g(x) = (f(x) + k) % (P-1)

calculate

h(x) = g(x)                                       if g(x)<=N-1,
       iterate the calculations with x = g(x)     otherwise
Community
  • 1
  • 1
elias
  • 849
  • 13
  • 28
  • I won't pretend to have understood it 100% by just giving it a quick read, but I am very much looking forward to revisiting this tomorrow with more coffee and less sleep deprivation. Great effort, +1! – Adrian Frühwirth May 07 '13 at 22:47
  • Thanks @Adrian, the problem was really challenging, so I dropped away some of my tasks, and gave it a shot. It is possible, that there is some pattern in my solution, that still should be avoided according to the question's author, but I think it's correct. However, I am also quite tired now, so it is possible that you find some fault in my heuristics. If that's the case, do not hesitate to tell, so we can correct it and come up with an ultimate solution. – elias May 07 '13 at 22:55
  • This look awesome! I am at work right now, will give it a good look later, but from what I can tell, this is the answer I sought. Cheers! – Discipol May 08 '13 at 06:50
  • I look forward to hear about your opinion once you give it a closer look. If some further questions arise about the implementation details, please feel free to contact me. – elias May 08 '13 at 06:57
2

This very closely related question (How to generate a predictable shuffling of a sequence without generating the whole sequence in advance?) and its accepted answer (and therein linked blog article and its referenced paper) might also be interesting and should solve your problem.

Community
  • 1
  • 1
Adrian Frühwirth
  • 42,970
  • 10
  • 60
  • 71
0

Based on your comment:

Function accepts 1 number =x, math happens, 1 number is returned which is a random position AND UNIQUE within the same interval where x comes from

I probably still don't understand the problem, but I give it another try. I hope it will give you some clue/help anyway!

<?php
$control = array();

$f = array(5,1,0,4,2,3); //Example

$x = 5; //Function accepts 1 number =x


for($n=1;$n<41;$n++) {
    echo doTheMath($control, $f, $x); //Echo just for example
}

function doTheMath(Array &$control, Array &$f, $x) {
    $n = count($f); //Interval count

    //1 number is returned 
    //AND
    //UNIQUE within the same interval where x comes from.

    while (!in_array($random = array_rand($f, 1), $control)) { //Get another random value from array-element if it's not in control-array

        if (!in_array($random, $control)) {
            $control[] = $random;

            //The are no more unique values, stop executing php
            if (count($control) == $n) {
                exit;
            }

            $result = 911 * $random * $n;
            return $result . '<hr />';
        }
    }


    return null;

}
?>

Example output:

10932
0
21864
27330
16398
bestprogrammerintheworld
  • 5,417
  • 7
  • 43
  • 72
  • he said he can't pre-shuffle the values. i think i understand him now, he wants a function that computes the unique "hash" position on the fly, where the hash of numbers has the same range. in other words: **the range of the function is a permutation of the given function domain** – user151496 May 07 '13 at 10:14
  • 1
    @user1916182 has got it, I think? The function in my example does the job, but like I said, that random prime number makes the output with a pattern. Mostly they look, in a graph, like shark fins. I would like are more random output, if possible. Extra browny points if I can add a seed parameter to control the randomness/hashing/permutation thingy :| I am sorry, I don't have the proper background to express like a normal human being. – Discipol May 07 '13 at 10:28
  • its in javascript. var N = 6; function perm( x ){return 911 * x % N } for( var i=0; i – Discipol May 07 '13 at 10:44
  • @Discipol - Ok you just want your JS-code "converted" to PHP ? – bestprogrammerintheworld May 07 '13 at 10:50
  • I am looking for an answer to my question, not code conversion. – Discipol May 07 '13 at 10:56
  • @Discipol - ah of course. I didn't quite understand the problem before. Sorry for that. – bestprogrammerintheworld May 07 '13 at 23:22