
I'm searching for a fast way to copy an object or list. I found the following suggestions:

b = a[:] <-- fast
b = a.copy() <-- slower

Yes, it worked, but a problem remains: if I change the content of b, then the content of a is also changed. Why?

--- following is my trial code ---

>>> import numpy as np
>>> a = np.zeros([4,4])
>>> a
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
>>> hex(id(a))
'0x16b21959a30'
>>> b = a
>>> hex(id(a)), hex(id(b))
('0x16b21959a30', '0x16b21959a30')
>>> c = a[:]
>>> hex(id(a)), hex(id(b)), hex(id(c))
('0x16b21959a30', '0x16b21959a30', '0x16b1fc54800')

Here, we find that the address of c is different from the others (the addresses of a and b are the same).

So now let's change the content of c and check the content of a.

>>> c[0][0]
0.0
>>> c[0][0] = 11
>>> c
array([[11.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])
>>> a
array([[11.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])
>>> b
array([[11.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

I changed only c[0][0], but I see that a[0][0] and b[0][0] have also changed. Why?

martineau
CH J
  • 3
    You are not working with a python `list`, but rather, a `numpy.ndarray` object. For python lists, this: `b = a[:] <-- fast b = a.copy() <-- slower` is not true. And `a[:]` doesn't make a copy for `numpy.ndarray`, but rather, a *view*. – juanpa.arrivillaga Nov 27 '19 at 07:58
  • 2
    @Rakesh that duplicate target is talking about `list` objects, not `numpy.ndarray` objects, which have slightly different semantics (slicing does not copy the underlying buffer, i.e. it creates a view) – juanpa.arrivillaga Nov 27 '19 at 08:00
  • You can use `copy.deepcopy` to recursively copy all mutable objects in the collection: `from copy import deepcopy; b = deepcopy(a)` – Reut Sharabani Nov 27 '19 at 08:02
  • You need to understand **shallow copy** and **deep copy** concepts. Check [this](https://stackoverflow.com/questions/184710/what-is-the-difference-between-a-deep-copy-and-a-shallow-copy) – Ersel Er Nov 27 '19 at 08:05
  • 1
    @ReutSharabani no, that isn't needed. This is `numpy`, a different beast regarding these things. – juanpa.arrivillaga Nov 27 '19 at 08:06
  • Correct your title and text. Copy for lists and numpy arrays is different. – hpaulj Nov 27 '19 at 10:09
  • `b` is the same array as `a`. `c` is a `view`, a new array object (different `id`) but shared data buffer. – hpaulj Nov 27 '19 at 17:45
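The distinction drawn in the comments above (same object, view, independent copy) can be checked directly. The following is a minimal sketch, not part of the original question; the names `d`, `lst`, and `lst_copy` are introduced here purely for illustration. It uses `np.shares_memory` and the `.base` attribute to show which arrays share a data buffer, and contrasts that with slicing a plain Python list.

```python
import numpy as np

a = np.zeros((4, 4))
b = a          # plain assignment: b is the very same array object as a
c = a[:]       # basic slicing: a new array object that views a's buffer
d = a.copy()   # a new array object with its own data buffer

print(b is a)                   # True
print(c is a, c.base is a)      # False True
print(np.shares_memory(a, c))   # True
print(np.shares_memory(a, d))   # False

# Writing through the view changes a (and b); the copy is unaffected.
c[0, 0] = 11
print(a[0, 0], d[0, 0])         # 11.0 0.0

# For a plain Python list, slicing does create a new (shallow) list.
lst = [0, 0, 0]
lst_copy = lst[:]
lst_copy[0] = 11
print(lst[0])                   # 0
```

So for the arrays in the question, `a.copy()` (or `np.copy(a)`) is the call that gives an array whose contents can be changed without affecting `a`.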

1 Answer

-1

a[:] does not cause a real memory copy. Only pointers are copied.

copy() causes a real copy, so it takes a long time.

I don't know the internal structure, but in the single-run measurement below deepcopy() came out faster than copy().

See the experiment code below.

from copy import deepcopy
import numpy as np
import time

a = np.zeros([5000,5000])

start = time.time()
c = a[:]
print("time :", time.time() - start)
# 5.245208740234375e-06

start = time.time()
d = a.copy()
print("time :", time.time() - start)
# 0.33116960525512695

start = time.time()
e = deepcopy(a)
print("time :", time.time() - start)
# 0.15706825256347656
  • 1
    No, **this is not a python list but a numpy array**. There are no pointers involved. – juanpa.arrivillaga Nov 27 '19 at 08:12
  • With numpy arrays, `a[:]` just returns a `view`, so is very fast. When I use `ipython` `timeit`, `deepcopy` is slightly slower than copy. `deepcopy` is only useful in `numpy` if the array is an object dtype. It's the same as `copy` for numeric arrays. – hpaulj Nov 27 '19 at 17:39
  • Thank you for telling me. I learned something new! And sorry for sharing the wrong information. Thanks! – Jun-Hyung Park Nov 28 '19 at 00:08
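The points in hpaulj's comment can be explored with a rough sketch like the one below. It is an illustration under stated assumptions, not the commenter's own code: the helper `best_time`, the array size, and the names `inner`, `obj`, `shallow`, and `deep` are all hypothetical. It repeats each measurement with `timeit` rather than timing a single run, then shows the one case where `deepcopy` behaves differently from `copy`, namely an object-dtype array. No timings are listed because they depend on the machine.

```python
import timeit
from copy import deepcopy

import numpy as np

a = np.zeros((1000, 1000))


def best_time(fn, repeat=5, number=10):
    """Return the best per-call time in seconds over several repeats."""
    return min(timeit.repeat(fn, repeat=repeat, number=number)) / number


# Results vary by machine; repeating the runs reduces one-off noise.
t_view = best_time(lambda: a[:])         # creates a view, copies no data
t_copy = best_time(lambda: a.copy())     # copies the whole buffer
t_deep = best_time(lambda: deepcopy(a))  # also copies the whole buffer
print(f"view: {t_view:.2e} s, copy: {t_copy:.2e} s, deepcopy: {t_deep:.2e} s")

# Object dtype: copy() duplicates only the references, deepcopy() also
# duplicates the referenced Python objects.
inner = [1, 2]
obj = np.empty(1, dtype=object)
obj[0] = inner
shallow = obj.copy()
deep = deepcopy(obj)
print(shallow[0] is inner)   # True
print(deep[0] is inner)      # False
```

For numeric arrays such as the one in the question, `a.copy()` is therefore sufficient; `deepcopy` only matters when the array holds references to mutable Python objects.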