
I have a very large matrix A with N rows and M columns. I want to shuffle the entries of each row independently, i.e. do the following:

for k = 1:N
    A(k,:) = A(k,randperm(M));
end

but fast and efficiently. (Both M and N are very large, and this is only an inner loop in a more massive outer loop.)

More context: I am trying to implement a permutation test for a correlation matrix (http://en.wikipedia.org/wiki/Resampling_%28statistics%29). My data is very large and I am very impatient. If anyone knows of a fast way to implement such a test, I would also be grateful to hear your input!

Do I have any hope of avoiding doing this in a loop?

Apologies if this has already been asked. Thanks!

Y. S.

1 Answer


If you type `open randperm` (at least in MATLAB R2010b) you'll see that its output `p` for an input `M` is just

[~, p] = sort(rand(1,M));

So, to vectorize this for N rows,

[~, P] = sort(rand(N,M), 2);

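Thus, generate P and use linear indexing into A. Since MATLAB stores matrices column by column, the element in row k and column j of an N-row matrix has linear index k + (j-1)*N: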

[~, P] = sort(rand(N,M), 2);
A = A(bsxfun(@plus, (1:N).', (P-1)*N));
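
On MATLAB R2016b and later, implicit expansion lets the `bsxfun` call be written as plain addition. Below is a minimal sketch with small sizes chosen for illustration, plus a sanity check that every row is only reordered. The variable `B` and the check are additions for illustration, not part of the original answer.

```matlab
N = 3; M = 4;
A = reshape(1:N*M, M, N).';          % 3x4 test matrix with rows 1:4, 5:8, 9:12
[~, P] = sort(rand(N, M), 2);        % each row of P is a permutation of 1:M
B = A((1:N).' + (P - 1) * N);        % implicit expansion (R2016b+) instead of bsxfun
% Each row of B must hold the same values as the matching row of A
assert(isequal(sort(B, 2), sort(A, 2)));
```

Writing the result to `B` rather than back into `A` keeps the original around for the check; in the actual inner loop, assigning back to `A` as in the answer works the same way.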

Example: given

N = 3;
M = 4;
A = [ 1     2     3     4
      5     6     7     8
      9    10    11    12 ];

one (random) result is

A =
     2     3     1     4
     7     5     8     6
     9    11    12    10
Luis Mendo
  • Glad it worked! That was an easy one :-) I wonder why `randperm` doesn't have the option to generate multiple permutations at once this way – Luis Mendo Mar 20 '15 at 00:24
  • agreed, seems like a very simple extension! – Y. S. Mar 20 '15 at 00:47
  • Beware that there's a non-zero probability of collisions using this method: i.e. what if there are duplicate numbers in the output of `rand`? Also, to generate an NxM matrix like this has O(N M log(M)) time complexity, rather than the optimal O(N M). See https://en.wikipedia.org/wiki/Shuffling#Shuffling_algorithms – Nzbuu Dec 03 '20 at 12:59
  • Having duplicate numbers in the output would not be a problem. `sort` would just keep their relative order to produce the output – Luis Mendo Dec 03 '20 at 15:14
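
The O(N M) alternative raised in the comments is the Fisher-Yates shuffle. Here is a rough sketch of how it could be applied to all rows at once, looping over columns and vectorizing over rows. It is an illustration rather than code from the answer, the variable names are made up for the example, and whether it actually beats the sort-based approach in MATLAB is untested here.

```matlab
N = 3; M = 4;
A = reshape(1:N*M, M, N).';      % 3x4 test matrix with rows 1:4, 5:8, 9:12
rows = (1:N).';                  % row index of every row, as a column vector
for j = M:-1:2
    k = randi(j, N, 1);          % per row: random swap partner in columns 1..j
    idxJ = rows + (j - 1) * N;   % linear indices of column j in every row
    idxK = rows + (k - 1) * N;   % linear indices of each row's partner column
    tmp = A(idxJ);               % swap the two entries in every row
    A(idxJ) = A(idxK);
    A(idxK) = tmp;
end
```

The loop runs M-1 times and each pass touches N elements, so the total work is proportional to N*M, at the cost of an explicit loop over columns.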