2

I have a 6-column multidimensional array like:

[59, '591', '592', '593', '594', 1582823720],
[9, '91', '92', '93', '94', 1582823745],
[7, '71', '72', '73', '74', 1582823745],
[61, '611', '612', '613', '614', 1582823752],
[54, '541', '542', '543', '544', 1582823717],
[24, '241', '242', '243', '244', 1582823706]

Is there an easy way to shuffle only specific columns "vertically" while keeping the other columns' content intact?

In the example above, let's say I only need to "vertically" shuffle columns 2-5, leaving columns 1 and 6 as they are, so the result would be:

[59, '541', '242', '243', '74', 1582823720],
[9, '591', '542', '593', '94', 1582823745],
[7, '241', '612', '543', '614', 1582823745],
[61, '611', '92', '73', '544', 1582823752],
[54, '71', '72', '613', '594', 1582823717],
[24, '91', '592', '93', '244', 1582823706]

I am new to Python; maybe there is a simple built-in solution or a module that would do it?

I came across the numpy library, which made shuffling entire array rows "vertically" a breeze with its random.shuffle() function. Maybe there is one that shuffles just specific columns?

petezurich
Acidon
  • For a vectorized solution : `a[:,1:5] = shuffle_along_axis(a[:,1:5], axis=0)` from https://stackoverflow.com/a/55317373/. – Divakar Feb 28 '20 at 16:02
  • @Divakar I ran into a `TypeError: list indices must be integers or slices, not tuple` error when I tried to use the `shuffle_along_axis` function... – Acidon Feb 28 '20 at 16:20
  • Yeah, I assumed the input `a` was an array, as you mentioned `multidimensional array` in the question. – Divakar Feb 28 '20 at 16:28
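
The `TypeError` above comes from indexing a plain list with `[:, 1:5]`, so the data has to be converted to an array first. A minimal sketch of that approach follows. The body of `shuffle_along_axis` here is my reading of the argsort-on-random-keys idea from the linked answer and should be treated as an assumption, not a verbatim copy.

```python
import numpy as np

def shuffle_along_axis(a, axis):
    """Return a copy of `a` with every 1-D slice along `axis` shuffled independently."""
    # Random keys of the same shape; argsort along `axis` turns them into
    # one independent permutation per slice.
    idx = np.random.rand(*a.shape).argsort(axis=axis)
    return np.take_along_axis(a, idx, axis=axis)

rows = [
    [59, '591', '592', '593', '594', 1582823720],
    [9, '91', '92', '93', '94', 1582823745],
    [7, '71', '72', '73', '74', 1582823745],
    [61, '611', '612', '613', '614', 1582823752],
    [54, '541', '542', '543', '544', 1582823717],
    [24, '241', '242', '243', '244', 1582823706],
]

# Convert the list of lists first; mixed ints/strings all become strings here.
a = np.array(rows)
# Shuffle columns 2-5 (indices 1..4) independently; columns 1 and 6 are untouched.
a[:, 1:5] = shuffle_along_axis(a[:, 1:5], axis=0)
```

Because the random keys are drawn per element, each column gets its own permutation, which is the behaviour the question asks for.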

4 Answers

1

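You can do it with NumPy's shuffle function, which shuffles a slice in place along its first axis (each row of the slice moves as a unit):

import numpy as np

x = np.array(yourlist)
np.random.shuffle(x[:, 1:5])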

For a horizontal shuffle you can use the transpose:

np.random.shuffle(x.T[:,1:5])

Example of a vertical shuffle:

x = np.arange(36).reshape(6,6)
x
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])
np.random.shuffle(x[:,1:5])
x
array([[ 0,  7,  8,  9, 10,  5],
       [ 6,  1,  2,  3,  4, 11],
       [12, 19, 20, 21, 22, 17],
       [18, 25, 26, 27, 28, 23],
       [24, 13, 14, 15, 16, 29],
       [30, 31, 32, 33, 34, 35]])
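
One way to check what the call above does is to compare the rows of the column block before and after. The sketch below is my own addition, not part of the answer. It checks whether `np.random.shuffle` keeps each row of `x[:, 1:5]` together, and contrasts that with `Generator.permuted(..., axis=0)` (NumPy 1.20 or later), which shuffles every column independently. The seed is arbitrary and only there for repeatability.

```python
import numpy as np

x = np.arange(36).reshape(6, 6)
original_block = x[:, 1:5].copy()

# Block shuffle: rows of the slice are reordered, but each row stays intact.
np.random.shuffle(x[:, 1:5])
rows_before = {tuple(r) for r in original_block}
rows_after = {tuple(r) for r in x[:, 1:5]}
block_rows_intact = rows_before == rows_after

# Independent shuffle: permuted() gives every column its own permutation
# and returns a new array, which is written back into the slice.
rng = np.random.default_rng(0)
y = np.arange(36).reshape(6, 6)
y[:, 1:5] = rng.permuted(y[:, 1:5], axis=0)
```

If `block_rows_intact` is true, the four middle values of every row travelled together, which matches the example output above.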
Aly Hosny
  • This seems to shuffle columns 2-5 as whole row chunks; I need each column to be shuffled individually, as shown in the example output. – Acidon Feb 28 '20 at 15:53
  • No, it doesn't. It shuffles each column individually. Check the example I added. – Aly Hosny Feb 29 '20 at 18:28
0

Here is some code using numpy.

data = [[59, '541', '242', '243', '74', 1582823720],
    [9, '591', '542', '593', '94', 1582823745],
    [7, '241', '612', '543', '614', 1582823745],
    [61, '611', '92', '73', '544', 1582823752],
    [54, '71', '72', '613', '594', 1582823717],
    [24, '91', '592', '93', '244', 1582823706]
]


import numpy as np
import random 


data_numpy = np.array(data)


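def shuffle_column(matrix, col_index_to_shuffle):
  """
  Shuffle a single column of a 2-D NumPy array in place and return the array.
  """
  # Slicing one column returns a 1-D view, so shuffling it also updates matrix.
  current_data = matrix[:, col_index_to_shuffle]
  random.shuffle(current_data)
  # Explicit write-back; redundant for a view, but harmless.
  matrix[:, col_index_to_shuffle] = current_data
  return matrix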


shuffled_matrix = shuffle_column(data_numpy, 2)
shuffled_matrix

array([['59', '541', '242', '243', '74', '1582823720'],
       ['9', '591', '92', '593', '94', '1582823745'],
       ['7', '241', '592', '543', '614', '1582823745'],
       ['61', '611', '612', '73', '544', '1582823752'],
       ['54', '71', '72', '613', '594', '1582823717'],
       ['24', '91', '542', '93', '244', '1582823706']], dtype='<U21')
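
Note that `np.array(data)` turned every value into a string (`dtype='<U21'`) because the rows mix integers and strings. If the integer columns need to stay integers, one option is an object array. The sketch below is a hypothetical variant: the name `shuffle_columns`, its `seed` parameter, and the use of `default_rng` are assumptions of mine, not part of the answer. It shuffles several columns independently and returns plain lists.

```python
import numpy as np

def shuffle_columns(rows, col_indices, seed=None):
    """Shuffle each listed column independently, leaving value types untouched."""
    rng = np.random.default_rng(seed)
    arr = np.array(rows, dtype=object)  # object dtype keeps ints as ints
    for col in col_indices:
        # permutation() returns a shuffled copy of the 1-D column
        arr[:, col] = rng.permutation(arr[:, col])
    return arr.tolist()

data = [
    [59, '591', '592', '593', '594', 1582823720],
    [9, '91', '92', '93', '94', 1582823745],
    [7, '71', '72', '73', '74', 1582823745],
    [61, '611', '612', '613', '614', 1582823752],
    [54, '541', '542', '543', '544', 1582823717],
    [24, '241', '242', '243', '244', 1582823706],
]

# Columns 2-5 are indices 1..4; the seed is arbitrary.
result = shuffle_columns(data, range(1, 5), seed=0)
```

The trade-off is that object arrays lose most of NumPy's speed advantages, which hardly matters for a table this small.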
user2707389
0

I am not sure if this exists in other libraries, although I believe such functionality should exist. However, I do not need numpy to do it:

  • Transpose the array.
  • Shuffle whichever sub-array you need.
  • Transpose the array back.

A code example to shuffle the FOURTH column is:

import random
# I am using pprint to beautify the output on the terminal
from pprint import pprint
arr = [[59, '591', '592', '593', '594', 1582823720],
       [9, '91', '92', '93', '94', 1582823745],
       [7, '71', '72', '73', '74', 1582823745],
       [61, '611', '612', '613', '614', 1582823752],
       [54, '541', '542', '543', '544', 1582823717],
       [24, '241', '242', '243', '244', 1582823706]
      ]
t_arr = [*zip(*arr)]
# Convert the elements to lists, because zip() produces tuples instead of lists.
t_arr = [list(sub_arr) for sub_arr in t_arr]
random.shuffle(t_arr[3])
arr_b = [*zip(*t_arr)]
# Again, converting back to lists
arr_b = [list(sub_arr) for sub_arr in arr_b]
# printing out the results :)
pprint(arr_b)

Here is the output:

[[59, '591', '592', '73', '594', 1582823720],
 [9, '91', '92', '243', '94', 1582823745],
 [7, '71', '72', '543', '74', 1582823745],
 [61, '611', '612', '93', '614', 1582823752],
 [54, '541', '542', '613', '544', 1582823717],
 [24, '241', '242', '593', '244', 1582823706]]
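
The same transpose, shuffle, transpose-back steps extend to several columns. Here is a minimal sketch using only the standard library; the function name `shuffle_list_columns` and its optional `rng` parameter are hypothetical names introduced for illustration.

```python
import random

def shuffle_list_columns(rows, col_indices, rng=random):
    """Return a new list of rows with each listed column shuffled independently."""
    # Transpose: one mutable list per column.
    columns = [list(col) for col in zip(*rows)]
    for i in col_indices:
        rng.shuffle(columns[i])
    # Transpose back into rows.
    return [list(row) for row in zip(*columns)]

arr = [
    [59, '591', '592', '593', '594', 1582823720],
    [9, '91', '92', '93', '94', 1582823745],
    [7, '71', '72', '73', '74', 1582823745],
    [61, '611', '612', '613', '614', 1582823752],
    [54, '541', '542', '543', '544', 1582823717],
    [24, '241', '242', '243', '244', 1582823706],
]

# Shuffle columns 2-5 (indices 1..4); a seeded Random makes the run repeatable.
shuffled = shuffle_list_columns(arr, range(1, 5), rng=random.Random(0))
```

Passing a `random.Random` instance is optional; with the default, the module-level generator is used and `arr` itself is left unmodified either way.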
Maged Saeed
0

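NumPy's shuffle can shuffle a sub-array in place.

If you want rows to keep their horizontal consistency, shuffle a slice. Note that `data[1:5]` selects rows 2-5, so the call below reorders those whole rows (all six columns); to shuffle only columns 2-5 as a block, pass `data[:, 1:5]` instead: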

data = np.array(data)
np.random.shuffle(data[1:5])

Preceded by np.random.seed(0), it gives

array([['59', '591', '592', '593', '594', '1582823720'],
       ['61', '611', '612', '613', '614', '1582823752'],
       ['54', '541', '542', '543', '544', '1582823717'],
       ['7', '71', '72', '73', '74', '1582823745'],
       ['9', '91', '92', '93', '94', '1582823745'],
       ['24', '241', '242', '243', '244', '1582823706']], dtype='<U11')

If you want the columns to be individually shuffled:

data = np.array(data)
tdata = np.transpose(data)   # view: each row of tdata is a column of data
for i in range(1, 5):
    np.random.shuffle(tdata[i])   # shuffle columns 2-5 one at a time, in place
data = np.transpose(tdata)

Preceded by np.random.seed(0), it gives

array([['59', '241', '92', '613', '244', '1582823720'],
       ['9', '71', '612', '243', '74', '1582823745'],
       ['7', '91', '542', '93', '614', '1582823745'],
       ['61', '611', '592', '73', '544', '1582823752'],
       ['54', '591', '72', '543', '94', '1582823717'],
       ['24', '541', '242', '593', '594', '1582823706']], dtype='<U11')
Serge Ballesta
  • The "columns to be individually shuffled" case is what I am after and your code seems to work; however, there seem to be a few typos: the second line should take `data` as its argument instead of `tdata`, and the third line should use `np.random.shuffle` instead of `np.shuffle`. As for your first "horizontal consistency" case, it only seems to work when I pass `data[:,1:5]` as the shuffle argument. – Acidon Feb 28 '20 at 16:13