27

I'm trying to randomly shuffle a linked list using a divide-and-conquer algorithm that runs in linearithmic (n log n) time and uses only logarithmic (log n) extra space.

I'm aware that I could do a Knuth shuffle, similar to what is used on a simple array of values, but I'm not sure how I would do this with divide-and-conquer. What I mean is, what am I actually dividing? Do I just divide down to each individual node in the list and then randomly assemble the list back together using some random value?

Or do I give each node a random number and then do a mergesort on the nodes based on the random numbers?
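To make that second idea concrete, here is a rough Python sketch of what I mean, with a hypothetical `Node` class: tag every node with a random key and merge-sort the nodes by key. (Storing a key on every node is of course O(n) extra space in total, which already breaks the O(log n) budget.)

import random

class Node:
    """Hypothetical singly linked list node, used only for illustration."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next
        self.key = 0.0      # per-node random sort key: O(n) extra space in total

def shuffle_by_random_keys(head):
    """Tag every node with a uniform random key, then merge-sort the nodes by key."""
    node = head
    while node is not None:
        node.key = random.random()
        node = node.next
    return _merge_sort(head)

def _merge_sort(head):
    if head is None or head.next is None:
        return head
    slow, fast = head, head.next          # split the list with slow/fast pointers
    while fast is not None and fast.next is not None:
        slow, fast = slow.next, fast.next.next
    mid, slow.next = slow.next, None
    return _merge(_merge_sort(head), _merge_sort(mid))

def _merge(a, b):
    dummy = tail = Node(None)
    while a is not None and b is not None:
        if a.key <= b.key:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a if a is not None else b
    return dummy.next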

foxcub
5StringRyan
  • Note: This is just an exercise from a book; it's not actually graded, nor is any type of credit given. This is literally for my own development enrichment. – 5StringRyan Aug 28 '12 at 21:24
  • You can't give every node a number in O(log n) extra storage space; this takes O(1) space for each number individually, so that takes a total of O(n) storage space. – templatetypedef Aug 28 '12 at 21:27
  • But if you have O(n) storage, just copy it to an array, and use a plain O(n) shuffle. – Karoly Horvath Aug 28 '12 at 21:34
  • like this one: http://stackoverflow.com/questions/11309200/how-to-randomly-shuffle-a-linked-list-in-c – Karoly Horvath Aug 28 '12 at 21:58
  • @templatetypedef: You can't store a number for each node all at the same time in O(log n) extra storage space, but he need not store the numbers at the same time. – Mooing Duck May 02 '14 at 18:01
  • The `Collections.shuffle` method in the JDK dumps the list into an array, does a Fisher-Yates shuffle, and then converts the array back to a list. It takes linear time and space. – Abhijit Sarkar May 22 '18 at 08:16

7 Answers

38

What about the following? Perform the same procedure as merge sort. When merging, instead of selecting an element (one-by-one) from the two lists in sorted order, flip a coin. Choose whether to pick an element from the first or from the second list based on the result of the coin flip.


Edit (2022-01-12): As GA1 points out in the answer below, this algorithm doesn't produce a permutation uniformly at random.


Algorithm.

shuffle(list):
    if list is empty or contains a single element
        return list

    list1,list2 = [],[]
    while list not empty:
        move front element from list to list1
        if list not empty: move front element from list to list2

    shuffle(list1)
    shuffle(list2)

    if length(list2) < length(list1):
        i = pick a number uniformly at random in [0..length(list2)]             
        insert a dummy node into list2 at location i 

    # merge
    while list1 and list2 are not empty:
        if coin flip is Heads:
            move front element from list1 to list
        else:
            move front element from list2 to list

    if list1 not empty: append list1 to list
    if list2 not empty: append list2 to list

    remove the dummy node from list
        

The key point for space is that splitting the list in two does not require any extra space. The only extra space we need is the O(log n) recursion depth on the stack.

The point with the dummy node is to realize that inserting and removing a dummy element keeps the distribution of the elements uniform.


Edit (2022-01-12): As Riley points out in the comments, the analysis below is flawed.


Analysis. Why is the distribution uniform? After the final merge, the probability P_i(n) of any given number ending up in the position i is as follows. Either it was:

  • in the i-th place in its own list, and the list won the coin toss the first i times, the probability of this is 1/2^i;
  • in the i-1-st place in its own list, and the list won the coin toss i-1 times including the last one and lost once, the probability of this is (i-1) choose 1 times 1/2^i;
  • in the i-2-nd place in its own list, and the list won the coin toss i-2 times including the last one and lost twice, the probability of this is (i-1) choose 2 times 1/2^i;
  • and so on.

So the probability

P_i(n) = \sum_{j=0}^{i-1} (i-1 choose j) * 1/2^i * P_j(n/2).

Inductively, you can show that P_i(n) = 1/n. I'll let you verify the base case; assume that P_j(n/2) = 2/n. The term \sum_{j=0}^{i-1} (i-1 choose j) is exactly the number of (i-1)-bit binary numbers, i.e. 2^{i-1}. So we get

P_i(n) = \sum_{j=0}^{i-1} (i-1 choose j) * 1/2^i * 2/n
       = 2/n * 1/2^i * \sum_{j=0}^{i-1} (i-1 choose j)
       = 1/n * 1/2^{i-1} * 2^{i-1}
       = 1/n

I hope this makes sense. The only assumption we need is that n is even, and that the two lists are shuffled uniformly. This is achieved by adding (and then removing) the dummy node.

P.S. My original intuition was nowhere near rigorous, but I list it just in case. Imagine we assign numbers between 1 and n at random to the elements of the list. And now we run a merge sort with respect to these numbers. At any given step of the merge, it needs to decide which of the heads of the two lists is smaller. But the probability of one being greater than the other should be exactly 1/2, so we can simulate this by flipping a coin.

P.P.S. Is there a way to embed LaTeX here?

foxcub
  • @templatetypedef Just added an explanation. – foxcub Aug 30 '12 at 21:01
  • I don't see how this could meet the O() requirements. what's the exact algorithm? – Karoly Horvath Aug 30 '12 at 21:04
  • I admit I'm fuzzy on what happens if the number of items is not 2^k. – foxcub Aug 30 '12 at 21:05
  • On linked lists, merge sort works in place, not counting the log n stack. – foxcub Aug 30 '12 at 21:11
  • You have to maintain the links while doing this recursive process, how is that going to work? Could you write down how the algorithm works, step-by-step? – Karoly Horvath Aug 31 '12 at 09:01
  • I'm not sure what you are asking @KarolyHorvath but merge sort on linked lists requires only O(log n) extra space (in fact, it can be done in O(1) extra space). In any case, I've added an explicit algorithm, so we can discuss the specifics. – foxcub Aug 31 '12 at 15:47
  • I believe the algorithm is unchanged when n is not 2^k. To see why, add dummy nodes at the end of the list to pad to nearest 2^k and perform the algorithm. At the end, remove the dummy nodes from the shuffled list. The random list obtained this way seems to have the same distribution as if you performed the algorithm without padding. This needs to be checked thoroughly though. – Alexandre C. Aug 31 '12 at 15:53
  • @AlexandreC. I agree that one needs to simulate padding, and it's probably not difficult to do (since list1 and list2 differ by at most one element), but I don't agree that the algorithm does it as is. An easy way to see it is to consider a list with 3 elements. With the current algorithm, the probability of the third element ending up in the third space is 1/4. – foxcub Aug 31 '12 at 16:00
  • @foxcub: You're right obviously. I have another proposal: one can keep track of the number of elements in each list at merge step, and adjust the probabilities accordingly. This is still O(log n) space. – Alexandre C. Aug 31 '12 at 16:12
  • Actually, @AlexandreC., I think it's easy to exploit your padding trick at each iteration. Let me update the code. – foxcub Aug 31 '12 at 16:26
  • I don't see how your algorithm is going to produce a uniformly shuffled list. For example, given two uniformly shuffled lists in the merge, I don't understand how the second element (of the first list, for example) can ever be in the first position of the result (assuming you insert at the front; invert otherwise). So, given two uniformly shuffled lists, your merge does not produce a uniform shuffle of the two. If the 'random merge' does not work at any step, it certainly won't at the last one either. – Aszarsha Mar 04 '14 at 18:05
  • The analysis is inductive. If you assume that the probability of any given element ending up in any given position (in the half lists) is 1/(n/2), then the probability of it ending up in the first (or any other) place is 1/n. I think where your confusion appears is that you are thinking about the conditional probability: the probability of an element ending up in the first place conditioned on a particular ordering of the two lists. You get 0 for that conditional probability, but that doesn't say much about the unconditional probability. – foxcub Mar 04 '14 at 20:40
  • But I think it does. Because the input list is not shuffled itself (hence the algorithm), the subdivided lists aren't either (one over two in the first or second list), and once you enter the 'merge phase', you have to take care to properly *uniformly* shuffle. Hence the need for the proper conditional probability; that is, P(result | input) must equal 1/n!. That conditional probability has to be propagated in the merge phase. An experiment which demonstrates the bias is quite easy to put in place. – Aszarsha Mar 04 '14 at 23:14
  • I'm not sure I'm following. (And I'm sure I disagree.) What is this "experiment which demonstrates the bias"? – foxcub Mar 05 '14 at 00:20
  • I created a little Lua script that implements your solution (it does not use your dummy nodes, which won't add any value in regard to the randomness); you can look into it at: https://gist.github.com/Aszarsha/9359090. I left a comment with a run example. The resulting matrix represents, for each entry of the input list, the number of times it appears at each position of the 'shuffled' list after the specified number of runs. The clear diagonal demonstrates the bias, whereas an unbiased shuffle would show a fairly uniform matrix. PS: How can you disagree when you are not even sure that you follow? – Aszarsha Mar 05 '14 at 01:06
  • I've never used Lua, but shouldn't that be "l1 = foxcubShuffle(l1)", and same for l2? As for disagreeing: I understand the proof of correctness, and I understand the gap in your argument; I don't need to understand its every detail. – foxcub Mar 05 '14 at 03:54
  • I should also point out that dummy nodes are crucial for randomness. You can see it in your code: if you use a power of 2 number of elements, you get much more even counts. – foxcub Mar 05 '14 at 03:57
  • You are absolutely right, of course; there is (was, it is now corrected) an obvious bug in the code, and the dummy nodes are indeed necessary for non-power-of-two cases. I'm going to leave my comments anyway, since I believe that the matrix experiment has its merit in itself (even if it 'proves' the opposite of my thought). – Aszarsha Mar 05 '14 at 13:37
  • @foxcub I have expanded on your answer to remove the need for dummy nodes, and I provided a 'shuffle on descent' version which *might* be faster sometimes too. – Aszarsha Mar 05 '14 at 18:58
  • @foxcub: I use this tool to download the equations as `GIF` and then add them to your answer: http://www.codecogs.com/latex/eqneditor.php – Moha the almighty camel Mar 05 '14 at 22:36
  • @foxcub: The algorithm that had popped into my mind was similar, but instead of putting odd-indexed elements in the first partition and even-indexed elements in the second partition, I put them in one of the partitions randomly. This would lead to ~50% more stack usage on average, and degenerate to O(n^2) sometimes, but it doesn't require dummy nodes. Actually, if one "randomly" picks which list to insert into based on the ratio of their sizes and the remaining elements such that the partition is always within one element of 50%, I think that would be faster... Hmm. – Mooing Duck May 02 '14 at 18:06
  • The "random based on the ratio" thing I just said is wrong and absurd, but if you select nodes randomly until one partition has half, then put the rest in the remaining partition, I'm pretty sure that works out to a random 50% distribution. Also far simpler. – Mooing Duck May 02 '14 at 18:39
  • I'm not sure whether your algorithm is truly uniform at random, but I take issue with your proof. It's one thing to say that every node is equally likely to end up in every spot, and it's a totally different thing to say every permutation is equally likely. For example, I could invent a dumber algorithm that just cyclically shifts the list by a random offset, and any given node has probability 1/n of ending up in any spot. But obviously this doesn't uniformly shuffle the array. – Riley Jan 12 '22 at 00:05
  • @Riley You are absolutely right. This needs fixing. I'll add a note to the answer. – foxcub Jan 13 '22 at 04:42
  • In fact, I now see the answer by @GA1 below that makes it clear that the algorithm is wrong too. What's the proper procedure? Should I delete this answer? – foxcub Jan 13 '22 at 04:57
  • @foxcub I think there's still value in keeping this answer because it's a natural algorithm to try, the comments are valuable, and other answers refer to yours. Now that your answer mentions the mistake and recommends the better answer I think it's fine how you've updated it. – Riley Jan 13 '22 at 05:30
7

Code

Up shuffle approach

This (Lua) version is improved from foxcub's answer to remove the need for dummy nodes.

To slightly simplify the code in this answer, this version supposes that your lists know their sizes. In the event they don't, you can always find the size in O(n) time, but even better: a few simple adaptations of the code can avoid having to compute it beforehand (like subdividing one-in-two instead of into first and second halves).

function listUpShuffle (l)
    local lsz = #l
    if lsz <= 1 then return l end

    local lsz2 = math.floor(lsz/2)
    local l1, l2 = {}, {}
    for k = 1, lsz2     do l1[#l1+1] = l[k] end
    for k = lsz2+1, lsz do l2[#l2+1] = l[k] end

    l1 = listUpShuffle(l1)
    l2 = listUpShuffle(l2)

    local res = {}
    local i, j = 1, 1
    while i <= #l1 or j <= #l2 do
        local rem1, rem2 = #l1-i+1, #l2-j+1
        if math.random() < rem1/(rem1+rem2) then
            res[#res+1] = l1[i]
            i = i+1
        else
            res[#res+1] = l2[j]
            j = j+1
        end
    end
    return res
end

To avoid using dummy nodes, you have to compensate for the fact that the two intermediate lists can have different lengths by varying the probability of choosing from each list. This is done by testing a [0,1] uniform random number against the ratio of nodes remaining in the first list over the total number of nodes remaining (in the two lists).

Down shuffle approach

You can also shuffle while you subdivide recursively, which in my humble tests showed slightly (but consistently) better performance. It might come from fewer instructions, or it might be due to cache warmup in LuaJIT, so you will have to profile for your use cases.

function listDownShuffle (l)
    local lsz = #l
    if lsz <= 1 then return l end

    local lsz2 = math.floor(lsz/2)
    local l1, l2 = {}, {}
    for i = 1, lsz do
        local rem1, rem2 = lsz2-#l1, lsz-lsz2-#l2
        if math.random() < rem1/(rem1+rem2) then
            l1[#l1+1] = l[i]
        else
            l2[#l2+1] = l[i]
        end
    end

    l1 = listDownShuffle(l1)
    l2 = listDownShuffle(l2)

    local res = {}
    for i = 1, #l1 do res[#res+1] = l1[i] end
    for i = 1, #l2 do res[#res+1] = l2[i] end
    return res
end

Tests

The full source is in my listShuffle.lua Gist.

It contains code that, when executed, prints a matrix representing, for each element of the input list, the number of times it appears at each position of the output list after a specified number of runs. A fairly uniform matrix 'shows' the uniformity of the distribution of the elements, hence the uniformity of the shuffle.

Here is an example run with 1,000,000 iterations, using a (non-power-of-two) 3-element list:

>> luajit listShuffle.lua 1000000 3
Up shuffle bias matrix:
333331 332782 333887
333377 333655 332968
333292 333563 333145
Down shuffle bias matrix:
333120 333521 333359
333435 333088 333477
333445 333391 333164
Aszarsha
  • Run up, up, up, down, down, down, up, up, up, down, down, down. That should show if Lua warmup or caching is interfering, and roughly how much. – Mooing Duck May 02 '14 at 19:05
7

You can actually do better than that: the best list shuffle algorithm is O(n log n) time and just O(1) space. (You can also shuffle in O(n) time and O(n) space by constructing a pointer array for the list, shuffling it in place using Knuth and re-threading the list accordingly.)
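To make the parenthetical O(n)-time / O(n)-space variant concrete, here is a rough Python sketch with a hypothetical `Node` class (the helper name is mine): collect the node pointers into an array, shuffle the array with `random.shuffle` (the standard library's Fisher-Yates implementation), then re-thread the `next` pointers.

import random

class Node:
    """Hypothetical singly linked list node, used only for illustration."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def shuffle_via_pointer_array(head):
    """O(n) time, O(n) space: pointer array + Fisher-Yates + re-threading."""
    nodes = []
    node = head
    while node is not None:
        nodes.append(node)
        node = node.next
    random.shuffle(nodes)            # in-place Fisher-Yates (Knuth) shuffle
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    if nodes:
        nodes[-1].next = None
    return nodes[0] if nodes else None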

Complexity proof

To see why O(n log n) time is minimal for O(1) space, observe that:

  • With O(1) space, updating the successor of an arbitrary list element necessarily takes O(n) time.
  • Wlog, you can assume that whenever you update one element, you also update all the other elements (leaving them unchanged if you wish), as this also takes just O(n) time.
  • With O(1) space, there are at most O(1) elements to choose from for the successor of any element you're updating (which specific elements these are will obviously depend on the algorithm).
  • Therefore, a single O(n) time update of all the elements could result in at most c^n different list permutations.
  • Since there are n! = O(n^n) = O(c^(n log n)) possible list permutations, you require at least O(log n) updates.

Linked-list data structure (because Python)

import collections.abc

class Cons(collections.abc.Sequence):
    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail

    def __getitem__(self, index):
        current, n = self, index
        while n > 0:
            if isinstance(current, Cons):
                current, n = current.tail, n - 1
            else:
                raise ValueError("Out of bounds index [{0}]".format(index))
        return current

    def __len__(self):
        current, length = self, 0
        while isinstance(current, Cons):
            current, length = current.tail, length + 1
        return length

    def __repr__(self):
        current, rep = self, []
        while isinstance(current, Cons):
            rep.extend((str(current.head), "::"))
            current = current.tail
        rep.append(str(current))
        return "".join(rep)

Merge-style algorithm

Here is an O(n log n) time and O(1) space algorithm based on iterative merge sort. The basic idea is simple: shuffle the left half, then the right half, then merge them by randomly selecting from the two lists. Two things worth noting:

  1. By making the algorithm iterative rather than recursive, and returning a pointer to the new last element at the end of every merge step, we reduce the space requirement to O(1) while keeping the time cost minimal.
  2. To make sure that all possibilities are equally likely for all input sizes, we use probabilities from the Gilbert–Shannon–Reeds model riffle shuffle when merging (see http://en.wikipedia.org/wiki/Gilbert%E2%80%93Shannon%E2%80%93Reeds_model).
import random

def riffle_lists(head, list1, len1, list2, len2):
    """Riffle shuffle two sublists in place. Returns the new last element."""
    for _ in range(len1 + len2):
        if random.random() < (len1 / (len1 + len2)):
            next, list1, len1 = list1, list1.tail, len1 - 1
        else:
            next, list2, len2 = list2, list2.tail, len2 - 1
        head.tail, head = next, next
    head.tail = list2
    return head

def shuffle_list(list):
    """Shuffle a list in place using an iterative merge-style algorithm."""
    dummy = Cons(None, list)
    i, n = 1, len(list)
    while (i < n):
        head, nleft = dummy, n
        while (nleft > i):
            head = riffle_lists(head, head[1], i, head[i + 1], min(i, nleft - i))
            nleft -= 2 * i
        i *= 2
    return dummy[1]

Another algorithm

Another interesting O(n log n) algorithm that produces not-quite-uniform shuffles involves simply riffle shuffling the list 3/2 log_2(n) times. As described in http://en.wikipedia.org/wiki/Gilbert%E2%80%93Shannon%E2%80%93Reeds_model, this leaves only a constant number of bits of information.
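As a rough illustration of that idea, here is a Python sketch operating on a plain Python list (helper names are mine): one GSR riffle cuts the list at a Binomial(n, 1/2) point and then drops from each packet with probability proportional to the packet's remaining size, and the shuffle repeats this about 3/2 log2(n) times. As noted, the result is only approximately uniform.

import math
import random

def gsr_riffle(cards):
    """One Gilbert-Shannon-Reeds riffle of a Python list."""
    n = len(cards)
    cut = sum(random.getrandbits(1) for _ in range(n))   # Binomial(n, 1/2) cut point
    left, right = cards[:cut], cards[cut:]
    out = []
    while left or right:
        # Drop from a packet with probability proportional to its remaining size.
        if random.randrange(len(left) + len(right)) < len(left):
            out.append(left.pop(0))
        else:
            out.append(right.pop(0))
    return out

def riffle_shuffle(cards, passes=None):
    """Riffle roughly 3/2 * log2(n) times, as described above."""
    if passes is None:
        passes = math.ceil(1.5 * math.log2(max(len(cards), 2)))
    for _ in range(passes):
        cards = gsr_riffle(cards)
    return cards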

Uri Granta
7

I'd say that foxcub's answer is wrong. To prove that, I will introduce a helpful definition for a perfectly shuffled list (call it an array or a sequence or whatever you want).

Definition: Assume we have a list L containing the elements a1, a2, ..., an at the indexes 1, 2, 3, ..., n. If we expose L to a shuffle operation (to whose internals we have no access), L is perfectly shuffled if and only if, by knowing the indexes of some k (k < n) elements, we can't deduce the indexes of the remaining n-k elements. That is, the remaining n-k elements are equally probable to be revealed at any of the remaining n-k indexes.

Example: if we have a four-element list [a, b, c, d] and, after shuffling it, we know that its first element is a ([a, .., .., ..]), then the probability for any of the elements b, c, d to occur in, let's say, the third cell equals 1/3.


Now, the smallest list for which the algorithm does not fulfil the definition has three elements. But the algorithm converts it to a 4-element list anyway, so we will try to show its incorrectness for a 4-element list.

Consider an input L = [a, b, c, d]. Following the first run of the algorithm, L will be divided into l1 = [a, c] and l2 = [b, d]. After shuffling these two sublists (but before merging them into the four-element result) we can get four equally probable 2-element lists:

l1shuffled = [a , c]     l2shuffled = [b , d]
l1shuffled = [a , c]     l2shuffled = [d , b]
l1shuffled = [c , a]     l2shuffled = [b , d]
l1shuffled = [c , a]     l2shuffled = [d , b]


Now try to answer two questions.
1. What is the probability that, after merging into the final result, a will be the first element of the list?
Simply enough, we can see that only two of the four pairs above (again, equally probable) can give such a result (p1 = 1/2). For each of these pairs, heads must be drawn during the first flip in the merge routine (p2 = 1/2). Thus the probability of having a as the first element of Lshuffled is p = p1*p2 = 1/4, which is correct.


2. Knowing that a is in the first position of Lshuffled, what is the probability of having c (we could as well choose b or d without loss of generality) in the second position of Lshuffled?
Now, according to the above definition of a perfectly shuffled list, the answer should be 1/3, since there are three elements to put in the three remaining cells of the list.
Let's see if the algorithm assures that.
After choosing a as the first element of Lshuffled we would now have either:
l1shuffled = [c] l2shuffled = [b, d]
or:
l1shuffled = [c] l2shuffled = [d, b]
The probability of choosing c in both cases is equal to the probability of flipping heads (p3 = 1/2); thus the probability of having c as the second element of Lshuffled, knowing that the first element of Lshuffled is a, equals 1/2. 1/2 != 1/3, which ends the proof of the incorrectness of the algorithm.
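For anyone who wants to check this numerically, here is a small Python enumeration (helper names are mine) of the coin-flip merge applied to the four equally probable sublist pairs listed above; it reproduces both the 1/4 from question 1 and the 1/2 (rather than 1/3) from question 2.

from collections import defaultdict
from fractions import Fraction

def merge_dist(l1, l2):
    """Exact output distribution of the coin-flip merge of two fixed lists."""
    if not l1:
        return {tuple(l2): Fraction(1)}
    if not l2:
        return {tuple(l1): Fraction(1)}
    dist = defaultdict(Fraction)
    for rest, p in merge_dist(l1[1:], l2).items():   # heads: take from l1
        dist[(l1[0],) + rest] += p / 2
    for rest, p in merge_dist(l1, l2[1:]).items():   # tails: take from l2
        dist[(l2[0],) + rest] += p / 2
    return dist

# The four equally probable (l1shuffled, l2shuffled) pairs from above.
pairs = [(['a', 'c'], ['b', 'd']), (['a', 'c'], ['d', 'b']),
         (['c', 'a'], ['b', 'd']), (['c', 'a'], ['d', 'b'])]
total = defaultdict(Fraction)
for l1, l2 in pairs:
    for perm, p in merge_dist(l1, l2).items():
        total[perm] += p / 4

p_a_first = sum(p for perm, p in total.items() if perm[0] == 'a')
p_a_then_c = sum(p for perm, p in total.items() if perm[:2] == ('a', 'c'))
print(p_a_first)               # 1/4
print(p_a_then_c / p_a_first)  # 1/2, not the 1/3 a perfect shuffle would give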

The interesting part is that the algorithm fulfils the necessary (but not sufficient) condition for a perfect shuffle, namely:

Given a list of n elements, for every index k (< n), for every element ak: after shuffling the list m times, if we have counted the times when ak occurred at index k, this count will tend to m/n in probability, with m tending to infinity.

GA1
  • Indeed. Which is why the merging needs to select elements using the probabilities based on the length of the lists, as in my answer. – Uri Granta Mar 01 '15 at 22:04
1

Here is one possible solution:

#include <stdlib.h>

typedef struct node_s {
   struct node_s * next;
   int data;
} node_s, *node_p;

/* Shuffles the range that starts at *headp and ends just before `last`,
   updating *headp to point at the new first node of the range. */
void shuffle_helper( node_p *headp, node_p last ) {
   static const int half = RAND_MAX / 2;
   while( (*headp != last) && ((*headp)->next != last) ) {
      node_p firsts[2] = {0, 0};
      node_p *lasts[2] = {0, 0};
      int counts[2] = {0, 0}, lesser;
      node_p first = *headp;

      /* Randomly deal every node of the range into one of two sublists. */
      while( first != last ) {
         int choice = (rand() <= half);
         node_p next = first->next;
         first->next = firsts[choice];
         firsts[choice] = first;                /* prepend to the chosen sublist */
         if( !lasts[choice] ) lasts[choice] = &(first->next);
         ++counts[choice];
         first = next;
      }

      lesser = (counts[1] < counts[0]);         /* index of the smaller sublist */

      if( !counts[lesser] ) {                   /* everything landed in one sublist: retry */
         *headp = firsts[!lesser];
         *(lasts[!lesser]) = last;
         continue;
      }

      /* Chain: larger sublist, then smaller sublist, then the old tail. */
      *headp = firsts[!lesser];
      *(lasts[!lesser]) = firsts[lesser];
      *(lasts[lesser]) = last;

      /* Recurse on the smaller sublist (at the back)... */
      shuffle_helper( lasts[!lesser], last );

      /* ...and keep looping on the larger one, which now ends where the
         already-shuffled smaller sublist begins. */
      last = *(lasts[!lesser]);
   }
}

void shuffle_list( node_p *thelist ) { shuffle_helper( thelist, NULL ); }

This is basically quicksort, but with no pivot, and with random partitioning.

The outer while loop replaces a recursive call.

The inner while loop randomly moves each element into one of two sublists.

After the inner while loop, we connect the sublists to one another.

Then, we recurse on the smaller sublist, and loop on the larger.

Since the smaller sublist can never be more than half the size of the initial list, the worst case depth of recursion is the log base two of the number of elements. The amount of memory needed is O(1) times the depth of recursion.

The average runtime, and number of calls to rand() is O(N log N).

More precise runtime analysis requires an understanding of the phrase "almost surely."

BenGoldberg
0

Bottom-up merge sort without compares: while merging, don't do any comparison, just swap the elements.

Naveen
0

You could traverse the list, randomly generating 0 or 1 at each node.

If it is 1, remove the node and place it as the first node of the list. If it is 0, do nothing.

Loop this until you reach the end of the list.
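For illustration, here is a minimal Python sketch of that single pass, assuming a hypothetical Node class with data and next fields; it implements the traversal exactly as described.

import random

class Node:
    """Hypothetical singly linked list node, used only for illustration."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def front_move_pass(head):
    """One pass: flip a bit at each node; on 1, unlink it and push it to the front."""
    if head is None:
        return None
    prev, current = head, head.next
    while current is not None:
        nxt = current.next
        if random.getrandbits(1):
            prev.next = nxt       # unlink current ...
            current.next = head   # ... and make it the new head
            head = current
        else:
            prev = current
        current = nxt
    return head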