I have a list/array of numpy arrays, representing objects split into subgroups.

I would like to create a copy of this array where I can swap elements within the subgroups and leave the original groupings unchanged.

The function I've written to do this is:

import numpy as np

def group_swap(groups):
    # Chooses two random groups and swaps a random element of one
    # with a random element of the other.
    gr = np.copy(groups)
    g1 = np.random.randint(len(gr))
    g2 = np.random.randint(len(gr))
    if g1 != g2:
        e1 = np.random.randint(len(gr[g1]))
        e2 = np.random.randint(len(gr[g2]))
        gr[g1][e1], gr[g2][e2] = gr[g2][e2].copy(), gr[g1][e1].copy()
        return gr
    else:
        return groups

Based on this question, I've been able to swap the elements. However, the elements in the original array are also swapped, as in this example.

a = np.array_split(np.arange(10),3)
print('original before swap: ',a)
a_swap = group_swap(a)
print('original after swap: ',a)
print('swapped array: ',a_swap)

Which gives:

original before swap:  
[array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
original after swap:  
[array([0, 1, 2, 7]), array([4, 5, 6]), array([3, 8, 9])]
swapped array:  
[array([0, 1, 2, 7]) array([4, 5, 6]) array([3, 8, 9])]

Ideally, the array a should be unchanged and only a_swap should show the swapped elements. I had hoped that making a copy of the array inside my function and working on that would do the trick, but it hasn't.

Could anyone help point out what I might be missing? I have a feeling it's something I'll kick myself for afterwards.

Thanks

PS: Oddly enough, it seems to work if the number of elements in each group is equal, but I'm not seeing why.

original before swap:  
[array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([ 8,  9, 10, 11])]
original after swap:  
[array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([ 8,  9, 10, 11])]
swapped array:  
[[ 0  1  8  3]
 [ 4  5  6  7]
 [ 2  9 10 11]]
aqshaw
  • "I have a list/array of numpy arrays" - you have a list. Not an array. The difference is **crucial**; do not ignore it. – user2357112 Aug 27 '18 at 19:58
  • `np.array_split` makes a list of arrays. You should leave that as a list, and not try to turn it into an array. If the sublists differ in size, the new array will be object dtype (a bastardized list). If they are all the same size, the new array will be 2d. You don't need that complication. `arr.copy` is quite different in the 2 cases. – hpaulj Aug 27 '18 at 20:34
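
Following the comments above, here is a minimal sketch of a variant that keeps the groups as a plain list and copies each sub-array individually. The name `group_swap_list` and the optional `rng` parameter are hypothetical, not from the question. Note one deliberate difference: it always picks two distinct groups, whereas the original returns its input untouched when both random picks land on the same group.

```python
import numpy as np

def group_swap_list(groups, rng=None):
    """Return a new list of arrays with one random swap between two groups.

    Hypothetical helper: assumes `groups` is a list of at least two
    non-empty 1-D arrays.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Copy every sub-array; the outer container stays an ordinary list.
    gr = [g.copy() for g in groups]
    # Pick two different group indices.
    g1, g2 = rng.choice(len(gr), size=2, replace=False)
    e1 = rng.integers(len(gr[g1]))
    e2 = rng.integers(len(gr[g2]))
    # Indexing a 1-D array with an integer yields a scalar, so a plain
    # tuple swap is safe here.
    gr[g1][e1], gr[g2][e2] = gr[g2][e2], gr[g1][e1]
    return gr

a = np.array_split(np.arange(10), 3)
a_swap = group_swap_list(a)
print('original after swap: ', a)
print('swapped list: ', a_swap)
```

Because only the sub-arrays are copied, this avoids the object-array versus 2d-array distinction entirely.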

2 Answers

When the groups have different lengths, `np.copy(groups)` cannot build a regular 2-D array, so you end up with a 1-D object array whose elements are references to your original sub-arrays (a nested object).

When the groups all have the same length, you get a single two-dimensional array (one object) with its own data.

The copy you are using is a shallow copy: it copies only the top-level object. In the second case that is the whole 2-D array, but in the first case it is only the references to the sub-arrays, so swapping elements in the copy also changes your original data. Use `copy.deepcopy` from the copy module instead: https://docs.python.org/3/library/copy.html
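
As a rough sketch of that suggestion, the function from the question could swap `np.copy` for `copy.deepcopy`. The name `group_swap_deep` is made up for this example; the rest follows the original logic, except that it always returns the copy, even when no swap happens.

```python
import copy
import numpy as np

def group_swap_deep(groups):
    """Swap one random element between two random groups in a deep copy."""
    # deepcopy duplicates the outer list and every sub-array inside it.
    gr = copy.deepcopy(groups)
    g1 = np.random.randint(len(gr))
    g2 = np.random.randint(len(gr))
    if g1 != g2:
        e1 = np.random.randint(len(gr[g1]))
        e2 = np.random.randint(len(gr[g2]))
        gr[g1][e1], gr[g2][e2] = gr[g2][e2], gr[g1][e1]
    # If both picks hit the same group, the copy is returned unswapped.
    return gr

a = np.array_split(np.arange(10), 3)
a_swap = group_swap_deep(a)
print('original after swap: ', a)
print('swapped copy: ', a_swap)
```

Since `gr` shares nothing with `groups`, the in-place swap cannot reach the original arrays.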

anishtain4
  • That did the trick. Comes with a slow down which I may need to work around when I scale up, but for now it's awesome. Thank you very much. – aqshaw Aug 27 '18 at 20:41
a = np.array_split(np.arange(10),3)
a = np.asarray(a)
b = a.copy() -1 +1
print('original before swap: ',a)
a_swap = group_swap(b)
print('original after swap: ',a)
print('swapped array: ',a_swap)

From what I can tell, because a holds sub-arrays of different lengths, it is an object array, and ndarray.copy() on an object array is shallow: the copy holds references to the same sub-arrays as a. Passing that copy to the method therefore still modifies the originals. The -1 +1 arithmetic is applied to each sub-array and creates new sub-arrays, so b no longer points at the same memory as a.

a = np.asarray(a) is there to cast from a list to a numpy array so that the -1 +1 is a valid operation. There are probably a lot of different ways to do the same thing; this just seemed the easiest.
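
To make the sharing visible, here is a small check, offered as a sketch rather than a definitive account. It builds the object array by hand with `np.empty(..., dtype=object)`, on the assumption that this mirrors what the question's NumPy version produced for groups of unequal length. All variable names are illustrative.

```python
import numpy as np

parts = np.array_split(np.arange(10), 3)   # lengths 4, 3, 3

# Build a 1-D object array holding the sub-arrays (assumed equivalent to
# what np.asarray gave for ragged input in the question's setup).
obj = np.empty(len(parts), dtype=object)
for i, p in enumerate(parts):
    obj[i] = p

# .copy() on an object array copies the references, not the sub-arrays.
shallow = obj.copy()
print('copy shares sub-arrays:', shallow[0] is obj[0])

# Arithmetic is applied per element and builds new sub-arrays.
rebuilt = obj.copy() - 1 + 1
print('after -1 +1 shares sub-arrays:', rebuilt[0] is obj[0])

# Equal-length groups become one 2-D array with its own data.
same = np.array_split(np.arange(12), 3)
arr2d = np.copy(same)
print('2-D copy shares memory:', np.shares_memory(arr2d, same[0]))
```

The first check should report sharing and the other two should not, which matches the behaviour described in the question.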

Bryce Booze