I'm using sklearn's shuffle() function to shuffle the rows of an m x n x o array and an m x 1 vector:

from sklearn.utils import shuffle
import numpy as np
X = np.random.rand(10,4,3)
y = np.random.rand(10)
X, y = shuffle(X, y, random_state=1)

Is there a way to unshuffle the data, i.e. reverse the shuffling? I cannot store both the shuffled and unshuffled data because X is pretty large (in the example above it is small).

machinery

1 Answer

Here's one way you could shuffle things and then unshuffle them:

import random


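def getperm(l):
    # Derive the seed from the data itself. The sum does not depend on the
    # order of the elements, so the shuffled list yields the same seed (and
    # therefore the same permutation) as the original. This holds exactly for
    # integers; float sums can vary with the order of addition.
    seed = sum(sum(a) for a in l)
    random.seed(seed)
    perm = list(range(len(l)))
    random.shuffle(perm)
    random.seed()  # optional: reseed so other code using random is not affected
    return perm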


def shuffle(l):
    perm = getperm(l)
    l[:] = [l[j] for j in perm]


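def unshuffle(l):
    perm = getperm(l)  # same permutation that shuffle() applied
    res = [None] * len(l)
    # shuffle() moved the element at index j to index i, so move it back
    for i, j in enumerate(perm):
        res[j] = l[i]
    l[:] = res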

Example call of the functions:

l=[(1,2),(3,4),(5,6),(7,8),(9,10)]
print(l)
shuffle(l)
print(l)  # shuffled
unshuffle(l)
print(l)  # the original

Output:

[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
[(5, 6), (7, 8), (9, 10), (3, 4), (1, 2)]
[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
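
For the NumPy arrays in the question, the same idea might be sketched with an explicit index permutation instead of a seed derived from the data. This is only an illustration, not part of the functions above: the names perm, inverse, X_orig and y_orig are made up for this example, and it uses np.random.RandomState directly rather than sklearn's shuffle(). An explicit seed also sidesteps the fact that a sum of floats can change slightly when the rows are added in a different order.

```python
import numpy as np

# Small stand-ins for the question's arrays; the real X would be much larger.
X = np.random.rand(10, 4, 3)
y = np.random.rand(10)
X_orig, y_orig = X.copy(), y.copy()  # kept only to check the round trip

# Draw one permutation of the row indices from an explicit seed.
rng = np.random.RandomState(1)
perm = rng.permutation(len(X))

# Shuffle: row i of the result is row perm[i] of the original.
X, y = X[perm], y[perm]

# Unshuffle: argsort of a permutation gives its inverse permutation.
inverse = np.argsort(perm)
X, y = X[inverse], y[inverse]

assert np.array_equal(X, X_orig) and np.array_equal(y, y_orig)
```

Only perm (or the seed used to regenerate it) has to be kept, not a second copy of the data. Note that indexing with an index array builds a new array, so memory briefly holds two copies of X during each step. If you want to stay with sklearn's shuffle(), passing np.arange(len(X)) as an additional array should give you the permutation in the same way, since all arrays passed in one call are shuffled consistently.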
Maxijazz