Shuffle 4-dimensional time-series by voxel

Question

I have a 4-dimensional array, which is a time-series of 3-dimensional arrays. I would like to shuffle each point in the 3-dimensional arrays along the time axis. Here is code I wrote to do this using nested for loops. Can this be done with fancy numpy indexing? Speed is a factor. Thank you.

import numpy as np

timepoints = 2
x = 4
y = 4
z = 3

vol_1 = np.zeros((x, y, z))
vol_2 = np.ones((x, y, z))
timeseries = np.array((vol_1, vol_2))

timeseries.shape  # (2, 4, 4, 3)

# One voxel over time.
timeseries[:, 0, 0, 0]

for xx in range(x):
    for yy in range(y):
        for zz in range(z):
            np.random.shuffle(timeseries[:, xx, yy, zz])

I noticed that `np.random.shuffle` only shuffles first dimension, was wandering whether some wise `swapaxes` / `reshape` combination would do the trick, that would be cool (and probably faster). — norok2, Sep 28 '17 at 18:51
@Jakub For example, does a single call to `np.random.shuffle(timeseries.reshape(2, -1))` in place of the three loops works for you? Or maybe the other way around (I am not sure to fully understand what are you trying to achieve): `np.random.shuffle(timeseries.reshape(2, -1).swapaxes(0, -1))` — norok2, Sep 28 '17 at 19:05
@norok2 Since, shuffle works along one axis, so with multi-dim arrays, the elements are shuffled in blocks, not what OP wants here. I have tried to explain on what's going on with this problem in my post, if that helps. — Divakar, Sep 28 '17 at 19:20

Divakar · Accepted Answer · 2017-09-28T19:30:59.783

We could generate all the shuffled indices along the first axis and then simply use advanced-indexing to get the randomized version. Now, to get those all shuffled indices, we could generate a random array of the same shape as the input array and get the argsort indices along the first axis. This has been explored before, as here.

Hence, we would have a vectorized implementation like so -

m,n,r,p = a.shape # a is the input array
idx = np.random.rand(*a.shape).argsort(0)
out = a[idx, np.arange(n)[:,None,None], np.arange(r)[:,None], np.arange(p)]

Just to explain to the readers on what exactly is the problem, here's a sample run -

1) Input 4D array :

In [711]: a
Out[711]: 
array([[[[60, 22, 34],
         [29, 18, 79]],

        [[11, 69, 41],
         [75, 30, 30]]],


       [[[63, 61, 42],
         [70, 56, 57]],

        [[70, 98, 71],
         [29, 93, 96]]]])

2) Random indices generated with proposed method for indexing along the first axis :

In [712]: idx
Out[712]: 
array([[[[1, 0, 1],
         [0, 1, 1]],

        [[0, 0, 1],
         [1, 0, 1]]],


       [[[0, 1, 0],
         [1, 0, 0]],

        [[1, 1, 0],
         [0, 1, 0]]]])

3) Finally index into input array for shuffled output :

In [713]: out
Out[713]: 
array([[[[63, 22, 42],
         [29, 56, 57]],

        [[11, 69, 71],
         [29, 30, 96]]],


       [[[60, 61, 34],
         [70, 18, 79]],

        [[70, 98, 41],
         [75, 93, 30]]]])

Looking closely, we will see that 63 at a[0,0,0,0] and 60 at a[1,0,0,0] are swapped on account of the idx values being 1 and 0 respectively at those corresponding places in idx. Next up, 22 and 61 stay at their places, since idx values are 0 and 1 and so on.

Runtime test

In [726]: timeseries = np.random.rand(10,10,10,10)

In [727]: %timeit org_app(timeseries)
100 loops, best of 3: 5.24 ms per loop

In [728]: %timeit proposed_app(timeseries)
1000 loops, best of 3: 289 µs per loop

In [729]: timeseries = np.random.rand(50,50,50,50)

In [730]: %timeit org_app(timeseries)
1 loop, best of 3: 720 ms per loop

In [731]: %timeit proposed_app(timeseries)
1 loop, best of 3: 426 ms per loop

At large sizes, the cost of creating random array is proving to be the bottleneck with the proposed method, but still shows a good speedup over the original loopy version.

norok2 · Answer 2 · 2017-09-29T14:25:38.223

I am adding this as an answer, because it wouldn't fit in the comments, by it is only a minor addition on top of @Divakar's excellent answer:

def divakar(a):
    m,n,r,p = a.shape # a is the input array
    idx = np.random.rand(*a.shape).argsort(0)
    return a[idx, np.arange(n)[:,None,None], np.arange(r)[:,None], np.arange(p)]

a = np.random.rand(50,50,50,50)
%timeit divakar(a)
560 ms ± 2.62 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I have observed some speedups by multiple use of reshaping instead of broadcasting, like:

def norok2(a):
    shape = a.shape
    idx = np.random.rand(*a.shape).argsort(0).reshape(shape[0], -1)
    return a.reshape(shape[0], -1)[idx, np.arange(shape[1] * shape[2] * shape[3])].reshape(shape)

a = np.random.rand(50,50,50,50)
%timeit norok2(a)
495 ms ± 1.61 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

as compared to OP's proposal:

def jakub(a):
    t, x, y, z = a.shape
    for xx in range(x):
        for yy in range(y):
            for zz in range(z):
                np.random.shuffle(a[:, xx, yy, zz])


%timeit jakub(a)
2 s ± 30.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Incidentally, my proposed modification is easier to extend to n-dimensional arrays and arbitrary shuffling axis, e.g.:

import numpy as np
import functools

def shuffle_axis(arr, axis=0):
    arr = np.swapaxes(arr, 0, axis)
    shape = arr.shape
    i = np.random.rand(*shape).argsort(0).reshape(shape[0], -1)
    return arr.reshape(shape[0], -1)[i, np.arange(functools.reduce(lambda x, y: x * y, shape[1:]))].reshape(shape).swapaxes(axis, 0)

with similar speeds:

a = np.random.rand(50,50,50,50)
%timeit shuffle_axis(a)
499 ms ± 2.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

EDIT revisited

...and the timings are not terribly worse than randomizing everything:

a = np.random.rand(50,50,50,50)
%timeit np.random.shuffle(a.ravel())
310 ms ± 1.84 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

which should be some sort of lower bound on the performances of any solution to this problem (but it does NOT solve the OP question).

Nope that `np.random.shuffle(a.ravel())` is just wrong. OP still wants to keep the shuffling along the first axis, which is lost with the flattening. — Divakar, Sep 29 '17 at 04:28
I added that only for speed comparison, assuming any solution would approach but never beat the speed of that, without rewriting this in C(ython). — norok2, Sep 29 '17 at 05:19
Well, then you should qualify it by stating that's not solving the problem and it meant just to get the limits on performance. Future readers or even OP won't know whether it actually solves the problem. — Divakar, Sep 29 '17 at 05:22

Shuffle 4-dimensional time-series by voxel

2 Answers2