
I am trying to shuffle a simple 1D array in Python while keeping a copy of the original array. However, when I call a shuffle function (either np.random.shuffle or random.shuffle), both the array and my backup end up shuffled identically.

Example:

import numpy as np
arr = np.arange(10)
arr_backup = arr
print(arr)
print(arr_backup)
np.random.shuffle(arr)
print(arr)
print(arr_backup)

This prints:

[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
[1 4 9 5 8 6 3 2 7 0]
[1 4 9 5 8 6 3 2 7 0]

I guess I am not understanding how Python allocates this item in the namespace or something. Any help is appreciated. Thanks.

user191919
  • `arr` and `arr_backup` reference the same object in memory. Just like with lists, dicts, etc., you have to explicitly copy the object to get an independent one. What you've essentially done here is make `arr_backup` an alias for `arr`. Modifying one will modify the other. You can test this by replacing `np.random.shuffle(arr)` with `np.random.shuffle(arr_backup)`. – HelpfulHound Jan 13 '20 at 21:35
  • The assignment `arr_backup = arr` creates another name *for the same array*. When you then shuffle it, both names point to the same shuffled one. If you want a new copy of the array, try `arr_backup = arr.copy()`. – Lee Daniel Crocker Jan 13 '20 at 21:37
  • So why doesn't this change `b` to 3 as well? After `a = 2; b = a; a = 3`, evaluating `b` gives `2`. – user191919 Jan 13 '20 at 21:39
  • @user191919 because **you never change the `int` object** (you can't anyway, because `int` objects are immutable). You simply assign a new object, namely `3`, to the name `a`. The same thing works with `numpy.ndarray` objects: after `arr = np.arange(10); arr_backup = arr; arr = np.arange(20)`, `print(arr_backup)` still shows the original 10-element array. – juanpa.arrivillaga Jan 13 '20 at 22:00
  • @user191919 read the following: https://nedbatchelder.com/text/names.html – juanpa.arrivillaga Jan 13 '20 at 22:02

2 Answers

All variables in Python hold references to objects. Assignment from one variable to another just copies that reference, so both arr and arr_backup point to the same object in memory.
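A minimal sketch of that point, reusing the names from the question. The helper variables (`same_object`, `seen_via_backup`, and so on) are made up for illustration only. It contrasts mutating the shared object with rebinding one of the names:

```python
import numpy as np

arr = np.arange(10)
arr_backup = arr                  # a second name for the same array; no data is copied

same_object = arr_backup is arr   # identity check: both names refer to one object

arr[0] = 99                       # in-place mutation of the shared array
seen_via_backup = int(arr_backup[0])  # the change is visible through the other name

arr = np.arange(20)               # rebinding: arr now names a brand-new array
backup_len_after_rebind = len(arr_backup)  # arr_backup still names the old array
same_after_rebind = arr_backup is arr

print(same_object, seen_via_backup, backup_len_after_rebind, same_after_rebind)
# prints: True 99 10 False
```

Mutation goes through the object, so every name sees it. Assignment only changes which object a name refers to, which is also why `a = 3` does not affect `b` in the integer example from the comments.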

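Shuffle mutates the array in place, so the changes are visible through both references. To avoid that, make a real copy with arr_backup = arr.copy() (or arr_backup = np.copy(arr)). Note that arr_backup = arr[:] copies a Python list, but on a NumPy array it only creates a view that shares the same data, so it would not protect the backup here.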
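As a rough sketch of the difference, assuming the same setup as the question: the names `view`, `backup`, `original` and `shuffled` are illustrative only. One caveat is worth checking here: basic slicing of a NumPy array (`arr[:]`) returns a view rather than an independent copy, which `np.shares_memory` can confirm.

```python
import numpy as np

arr = np.arange(10)
view = arr[:]          # basic slice of an ndarray: a view onto the same data
backup = arr.copy()    # independent copy with its own data buffer

np.random.shuffle(arr)  # shuffles arr in place

# The view follows the shuffle; the copy keeps the original order.
view_tracks_arr = np.array_equal(view, arr)
backup_unchanged = np.array_equal(backup, np.arange(10))
shares_with_view = np.shares_memory(arr, view)
shares_with_copy = np.shares_memory(arr, backup)

# Alternative: np.random.permutation returns a shuffled copy
# and leaves its input untouched, so no explicit backup is needed.
original = np.arange(10)
shuffled = np.random.permutation(original)

print(view_tracks_arr, backup_unchanged, shares_with_view, shares_with_copy)
# prints: True True True False
```

If the goal is simply "one shuffled array plus the untouched original", the `np.random.permutation` route may be the shortest option.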

kmaork

Taken directly from the NumPy docs:

import numpy as np

x = np.array([1, 2, 3])

y = x

z = np.copy(x)

Note that, when we modify x, y changes, but not z:

x[0] = 10

x[0] == y[0]  # True

x[0] == z[0]  # False

Yatish Kadam