
What's the best way to do the following in NumPy when dealing with symmetric square matrices (NxN) where N > 20000?

>>> import numpy as np
>>> a = np.arange(9).reshape([3,3])
>>> a = np.maximum(a, a.T)
>>> a
array([[0, 3, 6],
       [3, 4, 7],
       [6, 7, 8]])
>>> perm = np.random.permutation(3)
>>> perm
array([1, 0, 2])
>>> shuffled_arr = a[perm, :][:, perm]
>>> shuffled_arr
array([[4, 3, 7],
       [3, 0, 6],
       [7, 6, 8]])

This takes about 6-7 seconds when N is about 19K, while the same operation in MATLAB takes less than a second:

perm = randperm(N);
shuffled_arr = arr(perm, perm);

1 Answer

In [703]: N=10000
In [704]: a=np.arange(N*N).reshape(N,N);a=np.maximum(a, a.T)
In [705]: perm=np.random.permutation(N)

Applying both permutations in a single indexing step is quite a bit faster than chaining two indexing operations:

In [706]: timeit a[perm[:,None],perm]   # same as `np.ix_...`
1 loop, best of 3: 1.88 s per loop

In [707]: timeit a[perm,:][:,perm]
1 loop, best of 3: 8.88 s per loop

In [708]: timeit np.take(np.take(a,perm,0),perm,1)
1 loop, best of 3: 1.41 s per loop

a[perm,perm[:,None]] (the same index arrays in swapped order) is in the 8 s category, no faster than the chained version.
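As a rough check that the variants timed above are interchangeable, the sketch below compares their results on a small symmetric matrix. The helper name `shuffle_symmetric`, the seed, and the size `N = 500` are illustrative assumptions, not part of the original timings. Note that `a[perm, perm[:, None]]` yields the transpose of the shuffled matrix, which only matches the others because `a` is symmetric.

```python
import numpy as np


def shuffle_symmetric(a, perm):
    """Hypothetical helper: permute rows and columns of `a` in one indexing step."""
    # perm[:, None] has shape (N, 1) and perm has shape (N,), so the two
    # index arrays broadcast to (N, N): result[i, j] = a[perm[i], perm[j]].
    return a[perm[:, None], perm]


# Small symmetric test matrix (N chosen only to keep the check fast).
rng = np.random.default_rng(0)
N = 500
a = np.arange(N * N).reshape(N, N)
a = np.maximum(a, a.T)
perm = rng.permutation(N)

chained = a[perm, :][:, perm]                      # two indexing steps
one_step = shuffle_symmetric(a, perm)              # one broadcasted step
via_ix = a[np.ix_(perm, perm)]                     # np.ix_ builds the same index pair
via_take = np.take(np.take(a, perm, 0), perm, 1)   # two np.take calls
swapped = a[perm, perm[:, None]]                   # result[i, j] = a[perm[j], perm[i]]

for name, result in [("one_step", one_step), ("via_ix", via_ix),
                     ("via_take", via_take), ("swapped", swapped)]:
    print(name, np.array_equal(result, chained))
```

Timings will vary with the machine, NumPy version, and dtype, so it is worth re-running `timeit` on the actual matrix before settling on one form.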

hpaulj