74

This question was asked in a Microsoft interview. I am very curious to know: why do interviewers ask such strange questions on probability?

Given rand(N), a random generator which generates a random number from 0 to N-1:

int A[N]; // An array of size N
for (int i = 0; i < N; i++)
{
    int m = rand(N);
    int n = rand(N);
    swap(A[m], A[n]);
}

EDIT: Note that the seed is not fixed.

What is the probability that array A remains the same?
Assume that the array contains unique elements.
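Before attempting the math, it is easy to estimate the answer empirically. Below is a minimal Monte Carlo sketch (Python; the helper name `prob_unchanged` is my own, not part of the question):

```python
import random

def prob_unchanged(N, trials=200_000, seed=12345):
    """Run the question's swap loop `trials` times; count how often A survives."""
    rng = random.Random(seed)
    identity = list(range(N))
    hits = 0
    for _ in range(trials):
        A = identity[:]
        for _ in range(N):             # the loop from the question
            m = rng.randrange(N)       # rand(N): uniform over 0..N-1
            n = rng.randrange(N)
            A[m], A[n] = A[n], A[m]    # swap(A[m], A[n])
        hits += (A == identity)
    return hits / trials

for N in (2, 3, 4):
    print(N, prob_unchanged(N))
```

For N = 2 the estimate should land near 0.5, and for N = 3 near 5/27 ≈ 0.185, matching the exact values worked out in the answers.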

Dennis Meng
Green goblin
  • 9
    For a fixed `N` and a fixed seed, the probability is either `0` or `1` because it's not random at all. – Mysticial Aug 08 '12 at 19:58
  • 1
    Is that the answer they are expecting? Or do they want a mathematical analysis assuming "true" random variables? – BlackVegetable Aug 08 '12 at 19:58
  • @Mysticial: I'm pretty sure the seed isn't assumed to be fixed... – user541686 Aug 08 '12 at 19:58
  • Or perhaps they just wanted to see if you were capable of out-of-the-box thinking... – Mysticial Aug 08 '12 at 20:01
  • @Aashish: Can you assume `N` is even, just for the sake of simplicity? (Though maybe it won't help at all, not sure...) – user541686 Aug 08 '12 at 20:01
  • @Mehrdad, Why even? Is the assumption of N to be even going to help us in solving the problem? – Green goblin Aug 08 '12 at 20:03
  • 1
    What's the implementation for `swap()`? It could be an exercise to not trust names for face-value. – Mysticial Aug 08 '12 at 20:04
  • Post your idea on the even N. We will generalize it. – Green goblin Aug 08 '12 at 20:04
  • 1
    (I'm working on the idea, I'll post it here if I have something useful.) By the way, you might want to look at [Fisher-Yates](http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle) shuffling -- I think that's what they might've been trying to get at. i.e. They might not have been expecting you to get the exact probability, but to recognize the fact that it's a biased shuffling algorithm. – user541686 Aug 08 '12 at 20:04
  • Here, every time the numbers are being generated from 0 to N-1. Fisher-Yates shuffling is different. – Green goblin Aug 08 '12 at 20:05
  • @Aashish: I *know* Fisher-Yates is different, that's exactly why I mentioned it! *"They might not have been expecting you to get the exact probability, but to recognize the fact that it's a biased shuffling algorithm."* – user541686 Aug 08 '12 at 20:06
  • 1
    They're asking this question because they don't want [this thing](http://www.robweir.com/blog/2010/02/microsoft-random-browser-ballot.html) to happen again. – R. Martinho Fernandes Aug 08 '12 at 20:17
  • I want to know what knowing this answer has to do with the job you applied for. – Burhan Khalid Aug 08 '12 at 20:18
  • 25
    If this is indeed a C program (as the tags suggest), the probability is 1. The array elements are passed by value (there is no pass by reference in C), so the `swap` function cannot possibly change the contents of `A`. \edit: Well, `swap` could be a macro as well… let's just hope it's not :) – reima Aug 08 '12 at 20:25
  • 13
    @reima: it might be a macro. – Evgeny Kluev Aug 08 '12 at 20:28
  • 1
    @reima: writing a C function with prototype "swap(int, int)" is a bad coding style since such a function cannot do what it is expected from its name. So we may multiply the probability that the array will change to the probability of having a macro here (which is something like 99% :) – Evgeny Kluev Aug 08 '12 at 20:45
  • @Aashish, N being even or odd matters right? Say N is even = 4. If m=1, n=3 for i=0, there should be an n=1, m=3 in one of i=1,2,3, which implies you want N to be even for this logic. If N is odd, you need to add the probability that for i=k, k being 0 to N-1, m=n, the rest of the logic being the same as for N even. – Jeff Aug 08 '12 at 21:09
  • Are we assuming that `rand()` is supposed to be a uniform distribution? Is it implemented using `%`, and does it suffer from the slight bias toward small numbers as a result? – twalberg Aug 08 '12 at 21:48
  • I think we've been assuming that rand() picks each number with uniform probability. – Dennis Meng Aug 08 '12 at 22:14
  • 1
    I think the answers are too far from an analytical solution even for the uniform case, so I'd rather not get too fancy. – Ivan Vergiliev Aug 08 '12 at 22:19
  • Removed the c tag because there's absolutely nothing in the problem itself that says that this has to be in C. – Dennis Meng Aug 08 '12 at 22:24
  • @BurhanKhalid see my answer, answering this question demonstrates that the applicant is able to *think*. – Kirk Broadhurst Aug 08 '12 at 23:32
  • 4
    Consider asking on math.stackexchange.com. A rephrasing: given random a_i, b_i, what is the probability that the permutation (a_1 b_1) (a_2 b_2) ... (a_n b_n) = identity in symmetric group S_n? – sdcvvc Aug 09 '12 at 01:13
  • 1
    you may find this interesting: http://code.google.com/codejam/contest/dashboard?c=975485#s=a&a=3 – Rusty Rob Aug 09 '12 at 01:20
  • 1
    First, ask what "remain the same" means? The simpler calculation is for when no alterations are made at any point. It gets more complicated if we only care whether the array is in the same order at the end as it was when we started. – OrangeDog Aug 09 '12 at 14:13
  • 11
    Uh... nobody picked up on the fact that A is NEVER INITIALIZED? – Ken Beckett Aug 14 '12 at 18:12
  • @KenBeckett That's just what I was thinking, assuming this is not just simplified code. – Samuel Parkinson Aug 14 '12 at 19:13
  • I don't have time to analyse it at the moment, but could this be solved by Mathematical Induction on N? – Mark Hurd Aug 15 '12 at 03:56

15 Answers

30

Well, I had a little fun with this one. The first thing I thought of when I first read the problem was group theory (the symmetric group S_N, in particular). The for loop simply builds a permutation σ in S_N by composing transpositions (i.e. swaps) on each iteration. My math is not all that spectacular and I'm a little rusty, so bear with me if my notation is off.


Overview

Let A be the event that our array is unchanged after permutation. We are ultimately asked to find the probability of event A, Pr(A).

My solution attempts to follow the following procedure:

  1. Consider all possible permutations (i.e. reorderings of our array)
  2. Partition these permutations into disjoint sets based on the number of so-called identity transpositions they contain. This helps reduce the problem to even permutations only.
  3. Determine the probability of obtaining the identity permutation given that the permutation is even (and of a particular length).
  4. Sum these probabilities to obtain the overall probability the array is unchanged.

1) Possible Outcomes

Notice that each iteration of the for loop creates a swap (or transposition) that results in one of two things (but never both):

  1. Two elements are swapped.
  2. An element is swapped with itself. For all intents and purposes, the array is unchanged.

Let's give the second case a name. Define an identity transposition as follows:

An identity transposition occurs when a number is swapped with itself. That is, when n == m in the above for loop.

For any given run of the listed code, we compose N transpositions. There can be 0, 1, 2, ... , N of the identity transpositions appearing in this "chain".


For example, consider an N = 3 case:

Given our input [0, 1, 2].
Swap (0 1) and get [1, 0, 2].
Swap (1 1) and get [1, 0, 2]. ** Here is an identity **
Swap (2 2) and get [1, 0, 2]. ** And another **

Note that there is an odd number of non-identity transpositions (1) and the array is changed.


2) Partitioning Based On the Number of Identity Transpositions

Let K_i be the event that i identity transpositions appear in a given permutation. Note this forms an exhaustive partition of all possible outcomes:

  • No permutation can have two different quantities of identity transpositions simultaneously, and
  • All possible permutations must have between 0 and N identity transpositions.

Thus we can apply the Law of Total Probability:

    Pr(A) = Sum[i = 0 .. N] Pr(A | K_i) * Pr(K_i)

Now we can finally take advantage of the partition. Note that when the number of non-identity transpositions is odd, there is no way the array can remain unchanged*. Thus:

    Pr(A) = Sum[i : N - i even] Pr(A | K_i) * Pr(K_i)

*From group theory, a permutation is even or odd but never both. Therefore an odd permutation cannot be the identity permutation (since the identity permutation is even).

3) Determining Probabilities

So we now must determine two probabilities for N-i even:

  1. Pr(K_i)
  2. Pr(A|K_i)

The First Term

The first term, Pr(K_i), represents the probability of obtaining a permutation with i identity transpositions. This turns out to be binomial since for each iteration of the for loop:

  • The outcome is independent of the results before it, and
  • The probability of creating an identity transposition is the same, namely 1/N.

Thus for N trials, the probability of obtaining i identity transpositions is:

    Pr(K_i) = C(N, i) * (1/N)^i * ((N - 1)/N)^(N - i)

The Second Term

So if you've made it this far, we have reduced the problem to finding Pr(A|K_i) for N - i even. This represents the probability of obtaining an identity permutation given i of the transpositions are identities. I use a naive counting approach to determine the number of ways of achieving the identity permutation over the number of possible permutations.

First, consider the transpositions (n, m) and (m, n) equivalent. Then, let M be the number of possible non-identity transpositions. We will use this quantity frequently.

    M = (N^2 - N) / 2 = C(N, 2)

The goal here is to determine the number of ways a collection of transpositions can be combined to form the identity permutation. I will try to construct the general solution alongside an example of N = 4.


Let's consider the N = 4 case with all identity transpositions (i.e. i = N = 4). Let X represent an identity transposition. For each X, there are N possibilities (they are: n = m = 0, 1, 2, ... , N - 1). Thus there are N^i = 4^4 possibilities for achieving the identity permutation. For completeness, we add the binomial coefficient, C(N, i), to consider ordering of the identity transpositions (here it just equals 1). I've tried to depict this below with the physical layout of elements above and the number of possibilities below:

I  =  _X_   _X_   _X_   _X_
       N  *  N  *  N  *  N  * C(4, 4) => N^N * C(N, N) possibilities

Now without explicitly substituting N = 4 and i = 4, we can look at the general case. Combining the above with the denominator found previously, we find:

    Pr(A | K_N) = (C(N, N) * N^N) / (C(N, N) * N^N) = 1

This is intuitive. In fact, any value other than 1 should probably alarm you. Think about it: we are given the situation in which all N transpositions are said to be identities. What's the probability that the array is unchanged in this situation? Clearly, 1.


Now, again for N = 4, let's consider 2 identity transpositions (i.e. i = N - 2 = 2). As a convention, we will place the two identities at the end (and account for ordering later). We know now that we need to pick two transpositions which, when composed, will become the identity permutation. Let's place any transposition in the first location, call it t1. As stated above, there are M possibilities, since t1 is not an identity (it can't be, as we have already placed the two identities).

I  =  _t1_   ___   _X_   _X_
       M   *  ?  *  N  *  N

The only element left that could possibly go in the second spot is the inverse of t1, which is in fact t1 itself (and this is the only one, by uniqueness of inverses). We again include the binomial coefficient: in this case we have 4 open locations and we are looking to place 2 identity transpositions. How many ways can we do that? 4 choose 2.

I  =  _t1_   _t1_   _X_   _X_ 
       M   *  1   *  N  *  N  * C(4, 2) => C(N, N-2) * M * N^(N-2) possibilities

Again looking at the general case, this all corresponds to:

    Pr(A | K_(N-2)) = (C(N, N-2) * M * N^(N-2)) / (C(N, N-2) * M^2 * N^(N-2)) = 1 / M

Finally we do the N = 4 case with no identity transpositions (i.e. i = N - 4 = 0). Since there are a lot of possibilities, it starts to get tricky and we must be careful not to double count. We start similarly by placing a single element in the first spot and working out possible combinations. Take the easiest first: the same transposition 4 times.

I  =  _t1_   _t1_   _t1_   _t1_ 
       M   *  1   *  1   *  1   => M possibilities

Let's now consider two distinct transpositions t1 and t2. There are M possibilities for t1 and only M-1 possibilities for t2 (since t2 cannot equal t1). If we exhaust all arrangements, we are left with the following patterns:

I  =  _t1_   _t1_   _t2_   _t2_ 
       M   *  1   *  M-1 *  1   => M * (M - 1) possibilities   (1)st

   =  _t1_   _t2_   _t1_   _t2_
       M   *  M-1 *  1   *  1   => M * (M - 1) possibilities   (2)nd

   =  _t1_   _t2_   _t2_   _t1_
       M   *  M-1 *  1   *  1   => M * (M - 1) possibilities   (3)rd

Now let's consider three distinct transpositions t1, t2, t3. Let's place t1 first and then t2. As usual, we have:

I  =  _t1_   _t2_   ___   ___ 
       M   *  ?   *  ?  *  ?  

We can't say yet how many possible t2s there can be, and we will see why in a minute.

We now place t1 in the third spot. Notice that t1 must go there, since if it were to go in the last spot, we would just be recreating the (3)rd arrangement above. Double counting is bad! This leaves the third distinct transposition t3 for the final position.

I  =  _t1_   _t2_   _t1_   _t3_ 
       M   *  ?   *  1  *   ?  

So why did we have to take a minute to consider the number of t2s more closely? The transpositions t1 and t2 cannot be disjoint (i.e. they must share one, and only one since they also cannot be equal, of their n or m). The reason is that if they were disjoint, we could swap the order of the two transpositions, which means we would be double counting the (1)st arrangement.

Say t1 = (n, m). t2 must be of the form (n, x) or (y, m) for some x and y in order to be non-disjoint. Note that x may not be n or m and y may not be n or m. Thus, the number of possible transpositions that t2 could be is actually 2 * (N - 2).

So, coming back to our layout:

I  =  _t1_    _t2_    _t1_   _t3_ 
       M   * 2(N-2) *  1   *  ?  

Now t3 must be the inverse of the composition of t1 t2 t1. Let's do it out manually:

(n, m)(n, x)(n, m) = (m, x) 

Thus t3 must be (m, x). Note this is not disjoint to t1 and not equal to either t1 or t2 so there is no double counting for this case.

I  =  _t1_    _t2_    _t1_   _t3_ 
       M   * 2(N-2) *  1  *   1    => M * 2(N - 2) possibilities   

Finally, putting all of these together:

    Pr(A | K_(N-4)) = (M + 3*M*(M-1) + 2*M*(N-2)) / M^4
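As a sanity check on this count (my addition, not part of the original answer): for N = 4 we can enumerate all M^4 = 1296 sequences of four non-identity transpositions and count those that compose to the identity. The total should equal M + 3*M*(M-1) + 2*M*(N-2) = 6 + 90 + 24 = 120.

```python
from itertools import combinations, product

N = 4
transpositions = list(combinations(range(N), 2))  # the M = C(4, 2) = 6 non-identity swaps
count = 0
for seq in product(transpositions, repeat=N):     # all M^N = 1296 sequences
    A = list(range(N))
    for m, n in seq:
        A[m], A[n] = A[n], A[m]
    count += (A == list(range(N)))
print(count)   # 120
```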

4) Putting it all together

So that's it. Work backwards, substituting what we found into the original summation given in step 2. I computed the answer to the N = 4 case below. It matches the empirical number found in another answer very closely!

         N  =  4
         M  =  6   _________ _____________ _________
                  | Pr(K_i) | Pr(A | K_i) | Product | 
         _________|_________|_____________|_________|
        |         |         |             |         |
        |  i = 0  |  0.316  |  120 / 1296 |  0.029  |
        |_________|_________|_____________|_________|
        |         |         |             |         |
        |  i = 2  |  0.211  |    6 / 36   |  0.035  |
        |_________|_________|_____________|_________|
        |         |         |             |         |
        |  i = 4  |  0.004  |    1 / 1    |  0.004  |
        |_________|_________|_____________|_________|
                            |             |         |
                            |     Sum:    |  0.068  |
                            |_____________|_________|
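The table can be cross-checked by brute force (a verification sketch I added): evaluate the summation with the values above, and separately enumerate all N^(2N) = 65536 possible index tuples for N = 4.

```python
from itertools import product
from math import comb

N, M = 4, 6
pr_K = {i: comb(N, i) * (1/N)**i * ((N - 1)/N)**(N - i) for i in (0, 2, 4)}
pr_A_given_K = {0: 120/1296, 2: 6/36, 4: 1.0}
analytic = sum(pr_K[i] * pr_A_given_K[i] for i in (0, 2, 4))

hits = 0
for seq in product(range(N), repeat=2*N):  # every possible run of rand(N) choices
    A = list(range(N))
    for k in range(N):
        m, n = seq[2*k], seq[2*k + 1]
        A[m], A[n] = A[n], A[m]
    hits += (A == list(range(N)))
exact = hits / N**(2*N)

print(analytic, exact)
```

Both numbers should agree with the 4480/65536 ≈ 0.0683594 found empirically in another answer.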

Correctness

It would be cool if there were a result in group theory to apply here -- and maybe there is! It would certainly help make all this tedious counting go away completely (and shorten the problem to something much more elegant). I stopped working at N = 4. For N > 5, what is given above is only an approximation (how good, I'm not sure). It is pretty clear why if you think about it: for example, given N = 8 transpositions, there are clearly ways of creating the identity with four distinct transpositions which are not accounted for above. The number of ways becomes seemingly more difficult to count as the permutation gets longer (as far as I can tell...).

Anyway, I definitely couldn't do something like this within the scope of an interview. I would get as far as the denominator step if I was lucky. Beyond that, it seems pretty nasty.

User
  • I don't think you're right, because there are dependencies between the events - Pr(A and K = 0) + Pr(A and K = 1) + ... + Pr(A and K = N). And according to your analysis you assume there are not. – barak1412 Aug 09 '12 at 08:07
  • Hmm, thanks for the note. Maybe I am applying it incorrectly but I was considering the result: Let K_1, K_2, ..., K_N be a partition of our space S. Let A be a subspace of S. Then (sometimes called total probability, I think): P(A) = Sum of Pr(A | K_i) * P(K_i). I'm pretty sure dividing in this way creates a partition as well: no permutation can be in two different subsets (because it has a _single_ number of identities) and all possible permutations are accounted for. – User Aug 09 '12 at 13:23
  • afaict what you have is fine (except for the numerator appearing out of nowhere...) – andrew cooke Aug 09 '12 at 14:10
  • Certainly the answer is weak in that respect. I tried to give some insight with the example but admittedly it's not very good. It's mostly just trying to count and count efficiently wherever possible (by using patterns of what could possibly occur). I checked it for a few cases but obviously not all; there should probably be an asterisk there. If there was a nice means of calculating that probability, I think this solution would be pretty neat. It definitely eluded me though! :) – User Aug 09 '12 at 14:30
  • 1
    This is the right answer. It hits all the right buzz words ("permutation", "transposition", "identity") and the reasoning is right: 1) consider number of actual transpositions, 2) consider that you need an even number of transpositions and 3) count all the number of permuted pairs that make up an identity permutation. One criticism though: if you had written things in a more concise manner it would've probably been more obvious that you're getting it right. – Jérémie Aug 13 '12 at 23:35
  • Thanks for the comments (and upvotes). They inspired me to redo this post much more carefully ;). I know it is still incredibly verbose (as far as mathematics go) but I attempted to follow things through step-by-step as clearly as I could. – User Aug 15 '12 at 05:18
20

Very much curious to know why these people ask so strange questions on probability?

Questions like this are asked because they allow the interviewer to gain insight into the interviewee's

  • ability to read code (very simple code, but at least something)
  • ability to analyse an algorithm to identify execution path
  • skills at applying logic to find possible outcomes and edge cases
  • reasoning and problem solving skills as they work through the problem
  • communication and work skills - do they ask questions, or work in isolation based on information at hand

... and so on. The key to having a question that exposes these attributes of the interviewee is to have a piece of code that is deceptively simple. This shakes out the imposters: the non-coder is stuck; the arrogant jump to the wrong conclusion; the lazy or sub-par computer scientist finds a simple solution and stops looking. Often, as they say, it's not whether you get the right answer but whether you impress with your thought process.


I'll attempt to answer the question, too. In an interview I'd explain myself rather than provide a one-line written answer - this is because even if my 'answer' is wrong, I am able to demonstrate logical thinking.

A will remain the same - i.e. elements in the same positions - when

  • m == n in every iteration (so that every element only swaps with itself); or
  • any element that is swapped is swapped back to its original position

The first case is the 'simple' case that duedl0r gives, the case that the array isn't altered. This might be the answer, because

what is the probability that array A remains the same?

if the array changes at i = 1 and then reverts back at i = 2, it's in the original state but it didn't 'remain the same' - it was changed, and then changed back. That might be a smartass technicality.

Then considering the chance of elements being swapped and swapped back - I think that calculation is above my head in an interview. The obvious consideration is that it does not need to be a swap followed by a swap back: there could just as easily be a cycle among three elements, swapping 1 and 2, then 2 and 3, then 1 and 3, and finally 2 and 3. And continuing, there could be swaps between 4, 5 or more items that are 'circular' like this.

In fact, rather than considering the cases where the array is unchanged, it may be simpler to consider the cases where it is changed. Consider whether this problem can be mapped onto a known structure like Pascal's triangle.


This is a hard problem. I agree that it's too hard to solve in an interview, but that doesn't mean it is too hard to ask in an interview. The poor candidate won't have an answer, the average candidate will guess the obvious answer, and the good candidate will explain why the problem is too hard to answer.

I consider this an 'open-ended' question that gives the interviewer insight into the candidate. For this reason, even though it's too hard to solve during an interview, it is a good question to ask during an interview. There's more to asking a question than just checking whether the answer is right or wrong.

Kirk Broadhurst
  • what about longer cycles? can they not exist? or are they increasingly unlikely as the cycle length increases? one would hope one of the two cases is true, and it would be good to give some kind of quantitative estimate on how important they are. – andrew cooke Aug 08 '12 at 22:05
  • 2
    unfortunately it doesn't go like that...when you swap 1 and 2, then 2 and 3 and then 1 and 3 it will not be the first state...the latter one needs to be swapped again... – Erdem E. Aug 08 '12 at 23:07
  • 3
    right, but that's not a proof that no sequence with more than 2 swaps is possible. in your argument, four swaps makes everything ok, for example. – andrew cooke Aug 08 '12 at 23:10
  • @andrewcooke longer cycles can exist but I would assume their probability decreases exponentially in relation to their length. Presumably there'll be some kind of limit involved. – Kirk Broadhurst Aug 08 '12 at 23:20
  • oh, sorry victor, thought you were talking to me! – andrew cooke Aug 08 '12 at 23:25
  • IMO, this is a pretty terrible interview question. It's one thing to see how someone reasons through a typical algorithm problem. It's another to throw a problem that a bunch of smart people on SO can't give a reasonable answer to, because any answer is going to be essentially be BS, however well reasoned, so inferring anything of things listed is a bit of a shell game. – dfb Aug 09 '12 at 02:37
  • @dfb somewhat agree, but having 'easy' questions is often a waste of everyone's time. In this case you'd want to see your candidate go through the steps, see the problems with the naive solution (i.e. m==n), consider the complexities. Interviewers also like to see candidate remain calm when confronted with something difficult, admit they don't know rather than bluff, and so on. – Kirk Broadhurst Aug 09 '12 at 04:13
  • I have to disagree with some of this answer. This is a terrible, terrible interview question because it doesn't show any problem-solving skills. It's utterly trivial to see that as `N` approaches `infinity`, the probability of the array to change also approaches `infinity`. It's also trivial to see that when `N` is `1`, that probability is `0` and when `N` is `2`, that probability is `50%`. So that's that. But asking you for a generalized solution? Come on. It's just asinine mental masturbation. – David Titarenco Aug 12 '12 at 05:02
  • The problem has an easy interpretation (answer: 1/N^N) which tests basic probability knowledge, and a hard interpretation which tests advanced knowledge of group theory and combinatorics. I suspect the interviewer meant the former and was just bad at communicating. The hard version is much harder than problems on qualifying exams I took in graduate abstract algebra. Much too hard for an interview, unless it's for a research position at MSR. BTW, the probability approaches 1, not infinity. – Lucas Wiman Aug 14 '12 at 21:21
10

Below is C code to count, over all N^(2N) possible 2N-tuples of indices that rand can produce, the number that leave the array unchanged, and to calculate the probability. Starting with N = 0, it shows counts of 1, 1, 8, 135, 4480, 189125, and 12450816, with probabilities of 1, 1, .5, .185185, .0683594, .0193664, and .00571983. The counts do not appear in the Encyclopedia of Integer Sequences, so either my program has a bug or this is a very obscure problem. If the latter, the problem is not intended to be solved by a job applicant but to expose some of their thought processes and how they deal with frustration. I would not regard it as a good interview problem.

#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>


#define swap(a, b)  do { int t = (a); (a) = (b); (b) = t; } while (0)


static uint64_t count(int n)
{
    // Initialize count of how many times the original order is the result.
    uint64_t c = 0;

    // Allocate space for selectors and initialize them to zero.
    int *r = calloc(2*n, sizeof *r);

    // Allocate space for array to be swapped.
    int *A = malloc(n * sizeof *A);

    if (!A || !r)
    {
        fprintf(stderr, "Out of memory.\n");
        exit(EXIT_FAILURE);
    }

    // Iterate through all values of selectors.
    while (1)
    {
        // Initialize A to show original order.
        for (int i = 0; i < n; ++i)
            A[i] = i;

        // Test current selector values by executing the swap sequence.
        for (int i = 0; i < 2*n; i += 2)
        {
            int m = r[i+0];
            int n = r[i+1];
            swap(A[m], A[n]);
        }

        // If array is in original order, increment counter.
        ++c;    // Assume all elements are in place.
        for (int i = 0; i < n; ++i)
            if (A[i] != i)
            {
                // If any element is out of place, cancel assumption and exit.
                --c;
                break;
            }

        // Increment the selectors, odometer style.
        int i;
        for (i = 0; i < 2*n; ++i)
            // Stop when a selector increases without wrapping.
            if (++r[i] < n)
                break;
            else
                // Wrap this selector to zero and continue.
                r[i] = 0;

        // Exit the routine when the last selector wraps.
        if (2*n <= i)
        {
            free(A);
            free(r);
            return c;
        }
    }
}


int main(void)
{
    for (int n = 0; n < 7; ++n)
    {
        uint64_t c = count(n);
        printf("N = %d:  %" PRIu64 " times, %g probability.\n",
            n, c, c/pow(n, 2*n));
    }

    return 0;
}
Eric Postpischil
  • 1
    An independent Python program gives the same results up to N=6, so your implementation is correct. Each count can be divided by N^3, but that doesn't appear in OEIS either. – ecatmur Aug 09 '12 at 09:45
10

The behaviour of the algorithm can be modelled as a Markov chain over the symmetric group S_N.

Basics

The N elements of the array A can be arranged in N! different permutations. Let us number these permutations from 1 to N!, e.g. by lexicographic ordering. So the state of the array A at any time in the algorithm can be fully characterized by the permutation number.

For example, for N = 3, one possible numbering of all 3! = 6 permutations might be:

  1. a b c
  2. a c b
  3. b a c
  4. b c a
  5. c a b
  6. c b a

State transition probabilities

In each step of the algorithm, the state of A either stays the same or transitions from one of these permutations to another. We are now interested in the probabilities of these state changes. Let us call Pr(i→j) the probability that the state changes from permutation i to permutation j in a single loop iteration.

As we pick m and n uniformly and independently from the range [0, N-1], there are N² possible outcomes, each of which is equally likely.

Identity

For N of these outcomes m = n holds, so there is no change in the permutation. Therefore,

Pr(i→i) = N / N² = 1/N.

Transpositions

The remaining N² - N cases are transpositions, i.e. two elements exchange their positions and therefore the permutation changes. Suppose one of these transpositions exchanges the elements at positions x and y. There are two cases how this transposition can be generated by the algorithm: either m = x, n = y or m = y, n = x. Thus, the probability for each transposition is 2 / N².

How does this relate to our permutations? Let us call two permutations i and j neighbors if and only if there is a transposition which transforms i into j (and vice versa). We then can conclude:

Pr(i→j) = 2/N² if i and j are neighbors; otherwise Pr(i→j) = 0 (for i ≠ j).

Transition matrix

We can arrange the probabilities Pr(i→j) in a transition matrix P ∈ [0,1]^(N!×N!). We define

p_ij = Pr(i→j),

where p_ij is the entry in the i-th row and j-th column of P. Note that

Pr(i→j) = Pr(j→i),

which means P is symmetric.

The key point now is the observation of what happens when we multiply P by itself. Take any element p(2)_ij of P²:

p(2)_ij = Sum[k = 1 .. N!] Pr(i→k) · Pr(k→j)

The product Pr(i→k) · Pr(k→j) is the probability that starting at permutation i we transition into permutation k in one step, and transition into permutation j after another subsequent step. Summing over all in-between permutations k therefore gives us the total probability of transitioning from i to j in 2 steps.

This argument can be extended to higher powers of P. A special consequence is the following:

p(N)_ii is the probability of returning back to permutation i after N steps, assuming we started at permutation i.

Example

Let's play this through with N = 3. We already have a numbering for the permutations. The corresponding transition matrix is as follows:

P = 1/9 *
    | 3 2 2 0 0 2 |
    | 2 3 0 2 2 0 |
    | 2 0 3 2 2 0 |
    | 0 2 2 3 0 2 |
    | 0 2 2 0 3 2 |
    | 2 0 0 2 2 3 |

Multiplying P with itself gives:

P² = 1/81 *
    | 21 12 12 12 12 12 |
    | 12 21 12 12 12 12 |
    | 12 12 21 12 12 12 |
    | 12 12 12 21 12 12 |
    | 12 12 12 12 21 12 |
    | 12 12 12 12 12 21 |

Another multiplication yields:

P³ = 1/729 *
    | 135 126 126 108 108 126 |
    | 126 135 108 126 126 108 |
    | 126 108 135 126 126 108 |
    | 108 126 126 135 108 126 |
    | 108 126 126 108 135 126 |
    | 126 108 108 126 126 135 |

Any element of the main diagonal gives us the wanted probability, which is 15/81 or 5/27.

Discussion

While this approach is mathematically sound and can be applied to any value of N, it is not very practical in this form. The transition matrix P has N!² entries, which becomes huge very fast. Even for N = 10 the size of the matrix already exceeds 13 trillion entries. A naive implementation of this algorithm therefore appears to be infeasible.

However, in comparison to other proposals, this approach is very structured and doesn't require complex derivations beyond figuring out which permutations are neighbors. My hope is that this structuredness can be exploited to find a much simpler computation.

For example, one could exploit the fact that all diagonal elements of any power of P are equal. Assuming we can easily calculate the trace of P^N, the solution is then simply tr(P^N) / N!. The trace of P^N is equal to the sum of the N-th powers of its eigenvalues. So if we had an efficient algorithm to compute the eigenvalues of P, we would be set. I haven't explored this further than calculating the eigenvalues up to N = 5, however.
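For small N the whole construction is cheap to verify directly. Here is a sketch (my addition) that rebuilds P for N = 3 from the transition rule above and raises it to the third power:

```python
from itertools import permutations

def transition_matrix(N):
    perms = list(permutations(range(N)))
    idx = {p: k for k, p in enumerate(perms)}
    P = [[0.0] * len(perms) for _ in perms]
    for i, p in enumerate(perms):
        for m in range(N):              # each (m, n) pair occurs with probability 1/N²
            for n in range(N):
                q = list(p)
                q[m], q[n] = q[n], q[m]
                P[i][idx[tuple(q)]] += 1 / N**2
    return P

def matmul(A, B):
    return [[sum(a * B[k][j] for k, a in enumerate(row)) for j in range(len(B[0]))]
            for row in A]

P = transition_matrix(3)
P3 = matmul(matmul(P, P), P)
print(P3[0][0])   # 5/27 ≈ 0.185185
```

Every diagonal entry of P³ comes out to 135/729 = 5/27, in agreement with the result above.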

reima
  • Glancing through, this appears the only sound/correct answer on the entire topic. Have you had any luck generalizing the result? I'm going to have a look tonight. – Gleno Aug 20 '12 at 04:35
  • @Gleno I haven't looked into it beyond what I've already written. I would love to hear about your findings! – reima Aug 21 '12 at 19:40
  • 1
    Well, I haven't had any time to work on the problem yet. I verified your findings (derived the same matrix for n=3, same probabilities, etc.) and computed eigenvalues and probabilities up to N = 7. There clearly is a pattern, but it didn't jump out at me when looking at the eigenvalue sequences. I also tried cheating by looking at the diagonal elements of the matrices raised to the n-th power, and checking if these followed any known sequence - but alas they do not. A bit sad that your approach, which has yielded probabilities for up to N=7 at this point, is buried so low on the page. – Gleno Aug 21 '12 at 21:15
4

It's easy to observe the bounds 1/n^n <= p <= 1/n.

Here is an incomplete idea of showing an inverse-exponential upper bound.

You're drawing numbers from {1,2,..,n} 2n times. If any of them is unique (occurs exactly once), the array will definitely be changed, as that element has moved away and cannot return to its original place.

The probability that a fixed number is unique is 2n * 1/n * (1-1/n)^(2n-1) = 2 * (1-1/n)^(2n-1), which is asymptotically 2/e², bounded away from 0. [2n because you choose on which try you get it, 1/n that you got it on that try, (1-1/n)^(2n-1) that you did not get it on the other tries]

If the events were independent, you'd get that the chance that all numbers are non-unique is (1-2/e²)^n, which would mean p <= O((1-2/e²)^n). Unfortunately, they are not independent. I feel that the bound can be shown with more sophisticated analysis. The keyword is "balls and bins problem".

sdcvvc
  • 25,343
  • 4
  • 66
  • 102
3

One simplistic solution is

p >= 1 / N^N

Since one possible way the array stays the same is if m = n on every iteration. And m equals n with probability 1 / N.

It's certainly higher than that. The question is by how much..

Second thought: One could also argue that if you shuffle an array uniformly at random, every permutation has equal probability. Since there are N! permutations, the probability of getting exactly one of them (the one we started with) is

p = 1 / N!

which is a bit better than the previous result.

As discussed, the algorithm is biased. This means not every permutation has the same probability, so 1 / N! is not quite exact. You have to find out what the distribution over the permutations is.
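The bias is visible in a quick simulation for N = 3: under a uniform shuffle every permutation would appear with frequency 1/6 ≈ 0.1667, but the identity shows up noticeably more often (a sketch; `biased_shuffle` is my name for the algorithm from the question):

```python
import random
from collections import Counter

def biased_shuffle(n):
    # The algorithm from the question: n swaps of two uniformly random positions.
    a = list(range(n))
    for _ in range(n):
        m, k = random.randrange(n), random.randrange(n)
        a[m], a[k] = a[k], a[m]
    return tuple(a)

trials = 300000
counts = Counter(biased_shuffle(3) for _ in range(trials))
for perm in sorted(counts):
    print(perm, counts[perm] / trials)
```

The identity (0, 1, 2) lands near 5/27 ≈ 0.185 rather than 1/6, so the shuffle is demonstrably non-uniform.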

duedl0r
  • 9,289
  • 3
  • 30
  • 45
  • 1
    The array is shuffled randomly, the question is whether it is shuffled uniformly. The two concepts are unrelated. – Dietrich Epp Aug 08 '12 at 22:01
  • 1
    @DietrichEpp: yes, then you have to demonstrate whether this algorithm is biased or not. If it isn't biased the it's uniformly distributed. – duedl0r Aug 08 '12 at 22:04
  • 5
    Ah, I just remembered a super easy proof that the algorithm is biased. The algorithm can execute in N^N different ways, uniformly, but the number of permutations N! is almost certainly not a divisor of N^N, therefore, **the algorithm is biased for N >= 3.** – Dietrich Epp Aug 08 '12 at 22:16
  • @DietrichEpp For a proof that `n!` doesn't always divide `n^n`, just consider the case where `n` is a prime greater than 2. – Dennis Meng Aug 08 '12 at 22:19
  • @DietrichEpp: Nice! But I can't see where the bias is coming from. Or how much it differs from an uniformly distribution..? – duedl0r Aug 08 '12 at 22:27
  • oh! so it does (sorry). i'll delete my comment since it's wrong. but does Dietrich's proof still hold? – andrew cooke Aug 08 '12 at 22:27
  • @duedl0r: It's a proof by contradiction, so it doesn't help you see where the bias is coming from, it doesn't even help you figure out how much bias there is, it only helps you figure out that the bias exists. – Dietrich Epp Aug 08 '12 at 22:29
  • @DennisMeng: All numbers N >= 3 are greater than a prime number which they are not divisible by, so the algorithm is biased for N >= 3. – Dietrich Epp Aug 08 '12 at 22:30
  • but it's not correct, is it? it's for the famously wrong case. the code above doesn't have N^N distinct paths. some overlap (i may be wrong here...). – andrew cooke Aug 08 '12 at 22:31
  • @andrewcooke: Ah, you're right. It has N^2N paths. Same proof still works. – Dietrich Epp Aug 08 '12 at 22:32
  • @DietrichEpp I would've used the fact that if `a | bc` and `gcd(a,b) = 1`, then `a|c` and used contradiction. – Dennis Meng Aug 08 '12 at 22:34
  • @andrewcooke: Overlaps have nothing to do with it. You generate two numbers each cycle, each from a set of N possible numbers. Therefore there are N^2N possible outputs from the random number generator, and therefore it is impossible to generate a distribution with probabilities exactly probability 1/N! (for N >= 3). In order to get such a probability, you **have** to call the random number generator in a different way. – Dietrich Epp Aug 08 '12 at 22:35
  • @DietrichEpp: Hmm, can you really prove it like that? Say, you have a set with 3 elements and pick 4 times at random. With your proof this would be biased. But it can't be, right? :) – duedl0r Aug 08 '12 at 22:36
  • I'm pretty sure the total number of possibilities is `((n choose 2) + n)^n`, not `n^(2n)`. – Dennis Meng Aug 08 '12 at 22:36
  • @DennisMeng: No, that assumes that two distinct positions are swapped each round, which is not how the algorithm works. – Dietrich Epp Aug 08 '12 at 22:37
  • Yeah, I just realized that my reasoning wasn't quite right about 10 seconds ago, my bad : – Dennis Meng Aug 08 '12 at 22:38
  • @duedl0r: If you have a set with 3 elements, and pick 4 times at random, there are 6561 different ways that you can pick. However, there are 6 different permutations. No matter how those 6561 different ways are divided up, you will never get a probability of 1/6 (because 6 is not a factor of 6561). So, despite the fact that it "can't be", it most definitely is. – Dietrich Epp Aug 08 '12 at 22:40
  • @DietrichEpp: Ok I see, but how would you verify that? You probably run this `n` times and then make some statistics. Doesn't it cancel out in the long run? And is the distribution skew even noticeable? – duedl0r Aug 08 '12 at 22:47
  • 1
    @duedl0r: It's a mathematical proof, so you verify it by examining the logical steps and making sure that they follow correctly. There is no need to actually run the program, if you already know what the answer is. As I said, it's not a constructive proof, so the proof doesn't give you any information about how much bias there is, it only proves that there MUST be bias, because there would be a logical contradiction otherwise. When you say it "cancels out", that is really just a guess, but you are welcome to try to prove that it does cancel out. – Dietrich Epp Aug 08 '12 at 22:58
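Dietrich Epp's divisibility argument from this thread can also be checked mechanically for small N (a quick sketch):

```python
from math import factorial

# The algorithm has N^(2N) equally likely executions. A uniform shuffle would
# require N! to divide N^(2N); this holds for N = 2 but fails for N >= 3.
for n in range(2, 13):
    print(n, (n ** (2 * n)) % factorial(n) == 0)
```

Only n = 2 prints True here, matching the claim that the algorithm is necessarily biased for N >= 3.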
3

FYI, not sure the bound above (1/n^2) holds:

N=5 -> 0.019648 < 1/25
N=6 -> 0.005716 < 1/36

Sampling code:

import random

def sample(times, n):
    count = 0
    for i in range(times):
        count += p(n)
    return count / times

def p(n):
    perm = list(range(n))
    for i in range(n):
        a = random.randrange(n)
        b = random.randrange(n)
        perm[a], perm[b] = perm[b], perm[a]
    return perm == list(range(n))

print(sample(500000, 5))
dfb
  • 13,133
  • 2
  • 31
  • 52
3

Everyone assumes that A[i] == i, which was not explicitly stated. I'm going to make this assumption too, but note that the probability depends on the contents. For example, if A[i] = 0 for every i, then the probability is 1 for all N.

Here's how to do it. Let P(n,i) be the probability that the resulting array differs by exactly i transpositions from the original array.

We want to know P(n,0). It's true that:

P(n,0) = 
1/n * P(n-1,0) + 1/n^2 * P(n-1,1) = 
1/n * P(n-1,0) + 1/n^2 * (1-1/(n-1)) * P(n-2,0)

Explanation: we can get the original array in two ways, either by making a "neutral" transposition in an array that's already good, or by reverting the only "bad" transposition. To get an array with only one "bad" transposition, we can take an array with 0 bad transpositions and make one transposition that is not neutral.

EDIT: -2 instead of -1 in P(n-1,0)

user1367401
  • 1,030
  • 17
  • 26
  • 1
    1) The question states "Assume that the array contains unique elements." so without loss of generality you can assume `A[i] == i`. 2) I don't think there is an easy way to find `P(n,i)` in general. It is easy for `P(n,0)`, but for larger `i` it is unclear how many transpositions are "good" and how many "bad". – sdcvvc Aug 13 '12 at 17:27
  • @sdcvvc: You say that it's easy to find `P(n,0)` but that is exactly what the question is about. If you disagree with this reasoning give an example where it fails, or why is it incorrect. – Leonhard Euler Aug 13 '12 at 18:03
  • I don't think that `P(n-1,1) = (1-1/(n-1)) * P(n-2,0)`. If you are in the middle, you can add another bad step that you will undo later. For example, consider (1234) => (2134) => (2143) => (1243) => (1234). The number n-1 in denominator is also suspect since the random function allows repetition. – sdcvvc Aug 13 '12 at 18:27
1

It's not a full solution, but it's something at least.

Take a particular set of swaps that have no effect. We know that it must have been the case that its swaps ended up forming a bunch of loops of different sizes, using a total of n swaps. (For the purposes of this, a swap with no effect can be considered a loop of size 1)

Perhaps we can

1) Break them down into groups based on what the sizes of the loops are
2) Calculate the number of ways to get each group.

(The main problem is that there are a ton of different groups, but I'm not sure how you'd actually calculate this if you don't take into account the different groupings.)

Dennis Meng
  • 5,109
  • 14
  • 33
  • 36
1

Interesting question.

I think the answer is 1/N, but I don't have any proof. When I find a proof, I will edit my answer.

What I got until now:

If m == n, you won't change the array. The probability of getting m == n is 1/N, because there are N^2 options and only N of them are suitable ((i,i) for every 0 <= i <= N-1).

Thus, we get N/N^2 = 1/N.

Denote Pk the probability that after k iterations of swaps, the array of size N will remain the same.

P1 = 1/N. (As we saw above.)

P2 = (1/N)*P1 + ((N-1)/N)*(2/N^2) = 1/N^2 + 2(N-1)/N^3.

Explanation for P2:
We want to calculate the probability that after 2 iterations, the array with 
N elements won't change. We have 2 options : 
- in the 2nd iteration we got m == n (probability 1/N)
- in the 2nd iteration we got m != n (probability (N-1)/N)

If m == n, we need the array to be unchanged after the 1st iteration = P1.
If m != n, we need the 1st iteration to have chosen the same m and n 
(order is not important). So we get 2/N^2.
Because those events are independent we get P2 = (1/N)*P1 + ((N-1)/N)*(2/N^2).

Pk = (1/N)*P(k-1) + ((N-1)/N)*X. (the first term for m == n, the second for m != n)

I have to think more about what X equals. (X is just a replacement for the real formula, not a constant or anything else)

Example for N = 2.
All possible swaps:

(1 1 | 1 1),(1 1 | 1 2),(1 1 | 2 1),(1 1 | 2 2),(1 2 | 1 1),(1 2 | 1 2)
(1 2 | 2 1),(1 2 | 2 2),(2 1 | 1 1),(2 1 | 1 2),(2 1 | 2 1),(2 1 | 2 2)
(2 2 | 1 1),(2 2 | 1 2),(2 2 | 2 1),(2 2 | 2 2).

Total = 16. Exactly 8 of them leave the array the same.
Thus, for N = 2, the answer is 1/2.
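The N = 2 count above can be reproduced, and pushed a little further, by brute force over every possible sequence of random draws (a sketch; `exact` is my name for the helper):

```python
from itertools import product

def exact(n):
    # Enumerate all n^(2n) equally likely draw sequences and count those
    # that leave the array unchanged.
    stay = total = 0
    identity = list(range(n))
    for draws in product(range(n), repeat=2 * n):
        a = identity[:]
        for i in range(n):
            m, k = draws[2 * i], draws[2 * i + 1]
            a[m], a[k] = a[k], a[m]
        total += 1
        stay += (a == identity)
    return stay, total

print(exact(2))  # (8, 16)
print(exact(3))  # (135, 729)
```

This matches the 8/16 hand count, and for N = 3 gives 135/729 = 5/27.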

EDIT : I want to introduce another approach:

We can classify swaps into three groups: constructive swaps, destructive swaps and harmless swaps.

A constructive swap is defined to be a swap that causes at least one element to move to its right place.

A destructive swap is defined to be a swap that causes at least one element to move away from its correct position.

A harmless swap is defined to be a swap that does not belong to the other groups.

It is easy to see that this is a partition of all possible swaps. (intersection = empty set).

Now the claim I want to prove:

    The array will remain the same if and only if 
the number of destructive swaps equals the number of constructive swaps over the iterations.

If someone has a counter-example, please write it down as a comment.

If this claim is correct, we can take all combinations and sum them - 0 harmless swaps, 1 harmless swap, ..., N harmless swaps.

And for each possible k harmless swaps, we check whether N-k is even; if not, we skip it. If yes, we take (N-k)/2 destructive and (N-k)/2 constructive swaps, and enumerate all possibilities.

barak1412
  • 972
  • 7
  • 17
  • 1
    Your reasoning is wrong, since we swap N times, i.e. the probability would be 1 / N^2. But that is also wrong due to a combination of factors, some of which at least have already been explained in other answers. – Konrad Rudolph Aug 09 '12 at 14:20
  • @KonradRudolph where did you see I wrote the probability is 1/N^2? I am sure what I wrote is correct. – barak1412 Aug 09 '12 at 14:24
  • May you explain why? (what part is wrong) I am pretty sure it is not. – barak1412 Aug 09 '12 at 14:30
  • Doesn’t my first comment explain what’s wrong? The probability of n = m is 1/N for *one* swap. You do that N times, it needs to be the same every single time, which yields 1/N * 1/N = 1/N^2. Basically, the calculation of P2 in your answer is wrong. Just intuitively it should be obvious that the chance of preserving the original array should be much, *much* smaller than 1 in N. – Konrad Rudolph Aug 09 '12 at 14:37
  • Wait, the reasoning in my comment above is wrong, you cannot multiply the probabilities since they are conditional but your formula for P2 is still wrong, and the intuition still holds. – Konrad Rudolph Aug 09 '12 at 14:42
  • Where I wrote that the probability that n = m in more than one swap = 1/N. I think you didn't understand what I wrote at all. I will add explanation. – barak1412 Aug 09 '12 at 14:42
  • @KonradRudolph Look at my explanation for P2 - it is simple probability tree with independent events. – barak1412 Aug 09 '12 at 14:52
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/15131/discussion-between-barak1412-and-konrad-rudolph) – barak1412 Aug 09 '12 at 15:12
1

I would model the problem as a multigraph where the nodes are elements of the array and each swap adds an un-directed(!) edge between them. Then look for loops somehow (every node being part of a loop => original array).

Really need to get back to work! :(

user1443778
  • 581
  • 1
  • 5
  • 20
1

Well, from a mathematical perspective:

For the array elements to be swapped in place every time, the rand(N) function must generate the same number twice, for int m and int n. The probability that rand(N) generates the same number twice is 1/N, and rand(N) is called on each of the N loop iterations, so we have a probability of 1/(N^2).

Mokhtar Ashour
  • 600
  • 2
  • 9
  • 21
1

A naive implementation in C#. The idea is to create all possible permutations of the initial array and enumerate them. Then we build a matrix of possible transitions between states. Multiplying this matrix by itself N times, we get a matrix showing how many ways lead from permutation #i to permutation #j in N steps. Element [0,0] shows how many ways lead back to the initial state. The sum of the elements of row #0 is the total number of different ways. Dividing the former by the latter gives the probability.

In fact, the total number of ways is N^(2N).

Output:
For N=1 probability is 1 (1 / 1)
For N=2 probability is 0.5 (8 / 16)
For N=3 probability is 0.1851851851851851851851851852 (135 / 729)
For N=4 probability is 0.068359375 (4480 / 65536)
For N=5 probability is 0.0193664 (189125 / 9765625)
For N=6 probability is 0.0057198259072973293366526105 (12450816 / 2176782336)

class Program
{
    static void Main(string[] args)
    {
        for (int i = 1; i < 7; i++)
        {
            MainClass mc = new MainClass(i);
            mc.Run();
        }
    }
}

class MainClass
{
    int N;
    int M;

    List<int> comb;
    List<int> lastItemIdx;
    public List<List<int>> combinations;
    int[,] matrix;

    public MainClass(int n)
    {
        N = n;

        comb = new List<int>();
        lastItemIdx = new List<int>();
        for (int i = 0; i < n; i++)
        {
            comb.Add(-1);
            lastItemIdx.Add(-1);
        }

        combinations = new List<List<int>>();
    }

    public void Run()
    {
        GenerateAllCombinations();
        GenerateMatrix();
        int[,] m2 = matrix;
        for (int i = 0; i < N - 1; i++)
        {
            m2 = Multiply(m2, matrix);
        }

        decimal same = m2[0, 0];
        decimal total = 0;
        for (int i = 0; i < M; i++)
        {
            total += m2[0, i];
        }

        Console.WriteLine("For N={0} probability is {1} ({2} / {3})", N, same / total, same, total);
    }

    private int[,] Multiply(int[,] m2, int[,] m1)
    {
        int[,] ret = new int[M, M];
        for (int ii = 0; ii < M; ii++)
        {
            for (int jj = 0; jj < M; jj++)
            {
                int sum = 0;

                for (int k = 0; k < M; k++)
                {
                    sum += m2[ii, k] * m1[k, jj];
                }

                ret[ii, jj] = sum;
            }
        }

        return ret;
    }

    private void GenerateMatrix()
    {
        M = combinations.Count;
        matrix = new int[M, M];

        for (int i = 0; i < M; i++)
        {
            matrix[i, i] = N;
            for (int j = i + 1; j < M; j++)
            {
                if (2 == Difference(i, j))
                {
                    matrix[i, j] = 2;
                    matrix[j, i] = 2;
                }
                else
                {
                    matrix[i, j] = 0;
                }
            }
        }
    }

    private int Difference(int x, int y)
    {
        int ret = 0;
        for (int i = 0; i < N; i++)
        {
            if (combinations[x][i] != combinations[y][i])
            {
                ret++;
            }

            if (ret > 2)
            {
                return int.MaxValue;
            }
        }

        return ret;
    }

    private void GenerateAllCombinations()
    {
        int placeAt = 0;
        bool doRun = true;
        while (doRun)
        {
            doRun = false;
            bool created = false;

            for (int i = placeAt; i < N; i++)
            {
                for (int j = lastItemIdx[i] + 1; j < N; j++)
                {
                    lastItemIdx[i] = j; // remember the test

                    if (comb.Contains(j))
                    {
                        continue; // tail items should be nulled && their lastItemIdx set to -1
                    }

                    // success
                    placeAt = i;
                    comb[i] = j;
                    created = true;
                    break;
                }

                if (comb[i] == -1)
                {
                    created = false;
                    break;
                }
            }

            if (created)
            {
                combinations.Add(new List<int>(comb));
            }

            // rollback 
            bool canGenerate = false;
            for (int k = placeAt + 1; k < N; k++)
            {
                lastItemIdx[k] = -1;
            }

            for (int k = placeAt; k >= 0; k--)
            {
                placeAt = k;
                comb[k] = -1;

                if (lastItemIdx[k] == N - 1)
                {
                    lastItemIdx[k] = -1;
                    continue;
                }

                canGenerate = true;
                break;
            }

            doRun = canGenerate;
        }
    }
}
Artemix
  • 2,113
  • 2
  • 23
  • 34
0

Start from the probability that m == n on a single iteration, which is P(m==n) = 1/N, and require it on all N iterations, giving 1/N^N. But then you also have to consider the values getting swapped back. So I think the answer is 1/N^N.

Jason
  • 674
  • 4
  • 6
  • 1
    It's definitely higher than `(1/N)^N`. Thing is, that by itself is the probability that the array never actually changes, we still haven't counted the probability that things get moved around and just happen to return back to the original state. – Dennis Meng Aug 08 '12 at 22:31
0

Question: what is the probability that array A remains the same? Condition: Assume that the array contains unique elements.

Tried the solution in Java.

The random swapping happens on a primitive int array. In Java, method parameters are always passed by value, so what happens inside the swap method does not matter: the array elements a[m] and a[n] are passed to swap(a[m], a[n]), not the array itself.

So the array will remain the same, regardless of the condition mentioned above. See the Java code sample below:

import java.util.Random;

public class ArrayTrick {

    int a[] = new int[10];
    Random random = new Random();

    public void swap(int i, int j) {
        int temp = i;
        i = j;
        j = temp;
    }

    public void fillArray() {
        System.out.println("Filling array: ");
        for (int index = 0; index < a.length; index++) {
            a[index] = random.nextInt(a.length);
        }
    }

    public void swapArray() {
        System.out.println("Swapping array: ");
        for (int index = 0; index < a.length; index++) {
            int m = random.nextInt(a.length);
            int n = random.nextInt(a.length);
            swap(a[m], a[n]);
        }
    }

    public void printArray() {
        System.out.println("Printing array: ");
        for (int index = 0; index < a.length; index++) {
            System.out.print(" " + a[index]);
        }
        System.out.println();
    }

    public static void main(String[] args) {
        ArrayTrick at = new ArrayTrick();

        at.fillArray();
        at.printArray();
        at.swapArray();
        at.printArray();
    }
}

Sample output:

Filling array: 
Printing array: 
 3 1 1 4 9 7 9 5 9 5
Swapping array: 
Printing array: 
 3 1 1 4 9 7 9 5 9 5
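For comparison, the same pitfall is reproducible outside Java; in Python, rebinding a parameter inside a function likewise leaves the caller's list untouched (a minimal sketch):

```python
def swap(x, y):
    # Rebinds the local names x and y only; the caller sees no change.
    x, y = y, x

a = [3, 1, 4]
swap(a[0], a[1])
print(a)  # [3, 1, 4] -- unchanged
```

So this swap, like the Java one, silently does nothing; to actually swap, pass the list and the two indices instead.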

rashid
  • 264
  • 2
  • 13