0

Is it possible to shuffle a list of python tuples whilst keeping the ordering of the second element.

Input:

[(1,1), (1,2), (1,3), (2,2), (2,3), (3,1), (3,2), (3,3)]

Example Output:

[(3,1), (1,1), (2,2), (3,2), (2, 3), (1,2), (1,3), (3,3)]

Essentially, I would like to randomize the first element keeping the ordering of the second element. As an example:

np.random.seed(41)
x =  [(1,1), (1,2), (1,3), (2,2), (2,3), (3,1), (3,2), (3,3)]
np.random.shuffle(x)

Would output

[(3, 2), (3, 1), (1, 2), (3, 3), (1, 3), (2, 3), (2, 2), (1, 1)]

With no ordering on the second element and not what is desired.

Leon Adams
  • 491
  • 4
  • 10

2 Answers2

3

You could use random:

import random
l = [(1,1), (1,2), (1,3), (2,2), (2,3), (3,1), (3,2), (3,3)]
print(sorted(l, key=lambda x: (x[1], random.random())))
Andreas
  • 8,694
  • 3
  • 14
  • 38
  • There is absolutely no way that this can produce their example output, which has `(2, 3)` before `(1,2)`. It also doesn't "keep" any ordering, which the question asks for. – Kelly Bundy Aug 24 '21 at 01:10
  • @KellyBundy, why does it not "keep" any ordering? Please elaborate. The solution, sorts the second element and shuffels the first one. "keeping" here means most likely the ascending sorting as shown in the example, where the `(2, 3)` is most likely a type in the sample data. – Andreas Aug 24 '21 at 01:27
  • "keeping" means there was some order before, and that it remains afterwards. Your code disregards any order it might've had before. – Kelly Bundy Aug 24 '21 at 01:34
  • This solution cannot produce all valid random orders. For example, if the 2nd elements of tuples are all distinct, this will always return the same result with no randomness. Even in the example, you will never see (2,2) appear at the last position because it will always appear before (1,3) and (3,3). – Alain T. Aug 24 '21 at 02:21
2

If the order of the 2nd elements is arbitrary (i.e. not always increasing) a sort will not be sufficient. One way to do it would be to keep track of the 2nd element order in a dictionary and shuffle the (repeating) 1st elements, then sequentially map the second element to the result using the dictionary. An iterator can be stored in the dictionary to allow using the next() function for the 2nd part of the tuples.

T = [(1,1), (1,2), (1,3), (2,2), (2,3), (3,1), (3,2), (3,3)]

from random import sample

groups = {g:iter([t for t in T if t[0]==g]) for g in dict(T)}
result = [next(groups[v0]) for v0,_ in sample(T,len(T))]

print(result)

[(3, 1), (3, 2), (1, 1), (3, 3), (2, 2), (1, 2), (1, 3), (2, 3)]

How it works is by first building a dictionary of the tuples grouped by their fist element and preserving the original order in the associated list.

{ 
  1: [(1,1), (1,2), (1,3)],
  2: [(2,2), (2,3)],
  3: [(3,1), (3,2), (3,3)]
}

Then the whole list of tuples is shuffled (using the sample() function).

[(3, 3), (3, 2), (1, 2), (3, 1), (2, 2), (1, 3), (1, 2), (2, 3)]

This does not preserve the order of the second element so we replace each tuple in the shuffled list with the one corresponding to its 1st element in the dictionary, matching the original order. In the end, shuffling the tuples only aims to shuffle their 1st element given that the 2nd element in the shuffled order is ignored.

   [(3, 3), (3, 2), (1, 2), (3, 1), (2, 2), (1, 3), (1, 2), (2, 3)] # shuffled
     :  –    :  –    :  –    :  –    :  –    :  –    :  –    :  –
     :       :       :       :       :       :       :       :  
1: [ :       :      (1, 1),  :       :      (1, 2), (1, 3)]  :      # 
2: [ :       :          |    :      (2, 2),     |       |   (2, 3)] # groups
3: [(3, 1), (3, 2),     |   (3, 3)]     |       |       |       |   #
        |       |       |       |       |       |       |       |
        v       v       v       v       v       v       v       v
   [(3, 1), (3, 2), (1, 1), (3, 3), (2, 2), (1, 2), (1, 3), (2, 3)] # result

The iter() function is merely a trick to process the mapping inside a list comprehension. You could do the same thing using groups[v0].pop(0) to consume the grouped lists in the original order (although that would likely be less efficient)

Special thanks to Kelly Bundy for helping debug and optimize the solution

Alain T.
  • 40,517
  • 4
  • 31
  • 51
  • This could really be what they mean. Explains why they talk about "keeping" some order, why (2,3) can end up before (1,2), and now their example *input* also makes more sense. The only thing I see that speaks against your interpretation is that they accepted the other answer :-) – Kelly Bundy Aug 24 '21 at 02:24
  • Accepted for this use case as the order of the elements were always increasing. I can see how this can be a solution for the more general use case as indicated. – Leon Adams Aug 24 '21 at 02:27
  • @LeonAdams The answer you accepted is incompatible with your question. It cannot create the desired output example you gave, with (2,3) ending up before (1,2). Either you made a mistake in writing that example, or you made a mistake in accepting that answer. Which one is it? – Kelly Bundy Aug 24 '21 at 03:08
  • @Alain The `iter` is redundant, it just gives you the generator iterator you feed it. And you could save some code and memory by reusing the original tuples instead of taking them apart and recreating them. – Kelly Bundy Aug 24 '21 at 04:07
  • Since i'm shuffling the 1st values independently of the second ones, I kinda need to reassemble the tuples. I didn't find a way to do it without creating that separation and reassembly – Alain T. Aug 24 '21 at 04:23
  • Found a bug. Try it after replacing (3,3) with (4,1). It crashes. – Kelly Bundy Aug 24 '21 at 04:23
  • [This](https://tio.run/##fY7NCsMgEITvPsVCLwoekngphbyFtxBKICYNGBVj0pbSZ0@NS/@g9DTuODvfums4WSP2zq@rhBIqmvOccYhSoIhNCpwKnARGBJoimjUhnbcj@Ma0UYbRWR9gakanFSE7sLolvbezmyLj1h/okkNnPSwZj6/BgIShi1NZ9ix99PyY7Dvxapp1SKfFtFGXQLGpWrKasWdNiiOQSq6VoZJtZ@3AqPM3O6Sd8MKGKqv/gj@hW7hm74YfSOcHEyius3V9AA) is what I meant with reusing the tuples, but it suffers from the same bug as yours. Just makes it more obvious. That's actually how I found your bug. – Kelly Bundy Aug 24 '21 at 04:27
  • You can also see it in your own output, the `(2, 1)` doesn't exist in the input. – Kelly Bundy Aug 24 '21 at 04:28
  • I fell into the "closure/capture" trap (again). The iterator associated to each key is built as "compile time" which means it is the same for all keys and ends up using the last group for all keys. I had to change it to a function call (lambda) to ensure that every iterator is distinct. more details on that trap here: https://stackoverflow.com/a/66341706/5237560 – Alain T. Aug 24 '21 at 04:32
  • Yeah. [Alternatively](https://tio.run/##LY7BCsMgEETvfsUeDXho4qUU8hfeJJRAjRUSlc0GWkq/3Wq2p8fMzs5uftMzRX3NWIqBEazsVd8pqBgYumFgNbDSHNFs6mpOQiyYNsA5PirClhMS7POWVyeEx3TkvbZ//C2QQ2kJloRAECIYCAuQvUzj6Kfu9L26n5OvQLcfK7W/onuR5CLb0v/oWcF3pFGri9J07ZuMIZLk9a6UHw), you could build lists and use `iter` again :-) – Kelly Bundy Aug 24 '21 at 04:34
  • Indeed, that makes it much simpler to read and understand. (the capture issue is a bit arcane and most people wouldn't readily see why i'm using a lambda) – Alain T. Aug 24 '21 at 04:37
  • Downside of course is that it's now Theta(n^2) instead of just O(n^2). Btw, better update your example output as well, still has that invalid (2,1). – Kelly Bundy Aug 24 '21 at 04:39
  • I believe running time will depend on the number of distinct groups (distinct 1st values). Something like n x g (where g is the number of groups). Worst case is n^2 but that's only when all 1st values are distinct. – Alain T. Aug 24 '21 at 05:02
  • Yes, now with that neat `dict(T)` trick it's "n x g". – Kelly Bundy Aug 24 '21 at 05:27
  • Don't actually follow exactly how this is all being accomplished, but this is a superior solution in generating more randomization in the tuples whist maintaining the ordering of the second element. – Leon Adams Aug 25 '21 at 13:03