355

What's the easiest way to shuffle an array with python?

Machavity
davethegr8

11 Answers

612
import random
random.shuffle(array)
David Z
  • 11
    Is there an option that doesn't mutate the original array but returns a new shuffled array? – Charlie Parker Mar 29 '17 at 17:48
  • @Charlie That would be a good thing to ask in a separate question. (Maybe someone else has already asked it.) – David Z Mar 29 '17 at 18:17
  • 1
    @Charlie Parker Just make a copy of the original array before using random.shuffle: `copy_of_array = array.copy(); random.shuffle(copy_of_array)` – Bobby Zandavi Jun 02 '20 at 20:38
  • @CharlieParker From the python docs `shuffled = sample(array, k=len(array))` – allenh Jun 10 '21 at 13:38
  • The output is None for `a4 = np.array((blockshaped(c, 2, 3)))[3]`, where `blockshaped` returns an ndarray. – Tushar Kshirsagar Sep 25 '21 at 16:46
  • 1
    @Tushar Despite the name, the object you get from `np.array()` is not an "array" in the sense of this question. You may want to look for another question to find out how to shuffle a _Numpy_ array specifically. (Or you can search the web to find the right page in the Numpy documentation.) – David Z Sep 25 '21 at 21:22
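To make the points in this thread concrete, here is a minimal sketch using only the standard library. The variable names (`original`, `shuffled_copy`, `sampled`) are made up for illustration. It shows that random.shuffle works in place and returns None, and two ways to get a shuffled copy while leaving the original list alone.

```python
import random

array = [1, 2, 3, 4, 5]

# random.shuffle reorders the list in place and returns None.
result = random.shuffle(array)
print(result)  # None
print(array)   # the same five items, in a random order

# To keep the original untouched, shuffle a copy instead.
original = ['a', 'b', 'c', 'd']
shuffled_copy = original.copy()
random.shuffle(shuffled_copy)

# random.sample with k=len(...) also builds a new, shuffled list.
sampled = random.sample(original, k=len(original))

print(original)       # still ['a', 'b', 'c', 'd']
print(shuffled_copy)  # a random ordering of the same items
print(sampled)        # another random ordering
```

Assigning the return value of random.shuffle to a variable is the usual source of an unexpected None.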
117
import random
random.shuffle(array)
Douglas Leeder
54

An alternative way to do this, using sklearn:

from sklearn.utils import shuffle
X=[1,2,3]
y = ['one', 'two', 'three']
X, y = shuffle(X, y, random_state=0)
print(X)
print(y)

Output:

[2, 1, 3]
['two', 'one', 'three']

Advantage: You can shuffle multiple arrays simultaneously without disrupting the mapping between them, and 'random_state' seeds the shuffle so the result is reproducible.
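If pulling in sklearn just for this is too heavy, the same idea can be sketched with the standard library: shuffle one list of indices and apply it to every sequence. This is not how sklearn implements it. The helper name `shuffle_in_unison` and its `seed` parameter are hypothetical, chosen for this example.

```python
import random

def shuffle_in_unison(*sequences, seed=None):
    """Return shuffled copies of the sequences, all reordered by one permutation.

    Hypothetical helper for illustration; `seed` plays a role similar to
    sklearn's random_state but will not reproduce sklearn's orderings.
    """
    if not sequences:
        return []
    n = len(sequences[0])
    if any(len(seq) != n for seq in sequences):
        raise ValueError("all sequences must have the same length")
    rng = random.Random(seed)  # private generator, so the global state is untouched
    order = list(range(n))
    rng.shuffle(order)         # one permutation shared by every sequence
    return [[seq[i] for i in order] for seq in sequences]

X = [1, 2, 3]
y = ['one', 'two', 'three']
X_shuffled, y_shuffled = shuffle_in_unison(X, y, seed=0)
print(X_shuffled)
print(y_shuffled)
```

Because both lists are indexed with the same `order`, each number stays paired with its name, and the inputs are not modified.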

Qy Zuo
27

Just in case you want a new array you can use sample:

import random
new_array = random.sample(array, len(array))
Federico klez Culloca
Charlie Parker
20

The other answers are the easiest. However, it's a bit annoying that the random.shuffle method doesn't actually return anything: it just shuffles the given list in place. If you want to chain calls or just be able to declare a shuffled array in one line you can do:

    import random
    def my_shuffle(array):
        random.shuffle(array)
        return array

Then you can do lines like:

    for suit in my_shuffle(['hearts', 'spades', 'clubs', 'diamonds']):
        print(suit)
Mark Rhodes
  • 11
    It doesn't return anything *specifically* because it is trying to remind you that it works by altering the input in place. (This can save memory.) Your function alters its input in place also. – John Y Dec 20 '11 at 22:13
  • 2
    I guess it's a style thing. Personally I prefer the fact that I can write a single line to achieve what would take a couple otherwise. It seems odd to me that a language which aims to allow programs to be as short as possible doesn't tend to return the passed object in these cases. Since it alters the input in place, you can replace a call to random.shuffle with a call to this version without issue. – Mark Rhodes Dec 21 '11 at 14:39
  • 12
    Python doesn't actually aim to be as brief as possible. Python aims to balance readability with expressivity. It so happens to be fairly brief, mainly because it is a very high-level language. Python's own built-ins *typically* (not always) strive to *either* be "functionlike" (return a value, but don't have side effects) *or* be "procedurelike" (operate via side effects, and don't return anything). This goes hand-in-hand with Python's quite strict distinction between statements and expressions. – John Y Dec 21 '11 at 18:37
  • Nice. I suggest renaming it to my_shuffle to see the difference in the code immediately. – Jabba Feb 23 '12 at 09:21
  • Maybe, but this could be premature optimization (it could be helpful, but the need to shuffle doesn't explicitly require the need to return the array). Also, shuffle(array) followed by some use of shuffle would only be 2 lines as opposed to 3 + n (times usage), although I guess it would be a saving if you use it many times. Here is a great video that discusses this type of thing (e.g. phantom requirements and premature optimisation) - http://pyvideo.org/video/880/stop-writing-classes – Aaron Newton Apr 21 '12 at 01:23
15

When dealing with regular Python lists, random.shuffle() will do the job just as the previous answers show.

But when it comes to an ndarray (numpy.array), random.shuffle seems to break the original ndarray. Here is an example:

import random
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
a.shape = (3, 2)
print(a)
random.shuffle(a)  # rows of a may now be duplicated or lost
print(a)

Just use: np.random.shuffle(a)

Like random.shuffle, np.random.shuffle shuffles the array in-place.

abcd
Shuai Zhang
8

You can sort your array with a random key:

import random
sorted(array, key=lambda x: random.random())

The key is computed only once per item, so comparing items during the sort is still efficient.

However, random.shuffle(array) will likely be faster, since it is a linear-time shuffle built on a random number generator implemented in C.

This is O(N log N), by the way.

James
  • This is O(n log n), not O(log n), and inherently slower than an O(n) shuffle, C aspect aside. – Ry- Jan 22 '23 at 23:05
5

In addition to the previous replies, I would like to introduce another function.

Both numpy.random.shuffle and random.shuffle shuffle in place. However, if you want a shuffled array returned, numpy.random.permutation is the function to use.
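A short sketch of the difference, assuming NumPy is installed. The array contents and the names `b`, `c` and `result` are arbitrary:

```python
import numpy as np

a = np.arange(6).reshape(3, 2)  # rows: [0 1], [2 3], [4 5]

# permutation returns a shuffled copy (rows reordered); `a` is left unchanged.
b = np.random.permutation(a)

# shuffle reorders the rows of its argument in place and returns None.
c = a.copy()
result = np.random.shuffle(c)

print(a)       # original row order
print(b)       # same rows, random order
print(c)       # same rows, random order
print(result)  # None
```

For a multi-dimensional array, both functions reorder along the first axis only, so each row stays intact.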

Saber
2

I'm not sure why, but when I used random.shuffle() it returned None, so I wrote this. It might be helpful to someone:

import random

def shuffle(arr):
    for n in range(len(arr) - 1):
        rnd = random.randint(0, (len(arr) - 1))
        val1 = arr[rnd]
        val2 = arr[rnd - 1]

        arr[rnd - 1] = val1
        arr[rnd] = val2

    return arr
Jeeva
  • 4
    Yes, it returns None, but the array is modified in place. If you really want to return something, wrap it: `def shuffle(arr): random.shuffle(arr); return arr` (after `import random`). – user781903 Feb 08 '17 at 12:25
  • This swaps n−1 random pairs of adjacent items, which isn’t a correct (uniform) shuffle. – Ry- Jan 22 '23 at 23:13
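For anyone who does want to write the loop by hand, a uniform shuffle is usually done with the Fisher-Yates algorithm. The version below is a sketch, not the answer's code: the function name `fisher_yates_shuffle` and the optional `rng` parameter are made up for this example. In practice random.shuffle already does this job.

```python
import random

def fisher_yates_shuffle(items, rng=random):
    """Shuffle `items` in place with Fisher-Yates and return the same list.

    `rng` is a hypothetical hook: pass random.Random(seed) for repeatable
    results, or leave the default to use the module-level generator.
    """
    # Walk from the last index down to 1; position i receives an element
    # chosen uniformly from positions 0..i (randint includes both ends).
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)
        items[i], items[j] = items[j], items[i]
    return items

cards = ['hearts', 'spades', 'clubs', 'diamonds']
print(fisher_yates_shuffle(cards))
```

Each position is swapped with a uniformly chosen position at or below it, rather than with a neighbour, which is what makes every ordering equally likely.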
0

Be aware that random.shuffle() should not be used on multi-dimensional arrays as it causes repetitions.

Imagine you want to shuffle an array along its first dimension. We can create the following test example:

import numpy as np
x = np.zeros((10, 2, 3))

for i in range(10):
   x[i, ...] = i*np.ones((2,3))

so that along the first axis, the i-th element corresponds to a 2x3 matrix where all the elements are equal to i.

If we use the correct shuffle function for multi-dimensional arrays, i.e. np.random.shuffle(x), the array will be shuffled along the first axis as desired. However, using random.shuffle(x) will cause repetitions. You can check this by running len(np.unique(x)) after shuffling, which gives 10 (as expected) with np.random.shuffle() but only around 5 with random.shuffle().

Wise Cloud
-1
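import numpy

# arr = numpy array to shuffle
def shuffle(arr):
    a = numpy.arange(len(arr))  # indices not yet chosen
    b = numpy.empty(1)          # placeholder first element, dropped below
    for i in range(len(arr)):
        # randint's upper bound is exclusive, so this picks a position in a
        sel = numpy.random.randint(0, high=len(a), size=1)
        b = numpy.append(b, a[sel])
        a = numpy.delete(a, sel)
    b = b[1:].astype(int)
    return arr[b]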
MBT