For a minimal working example, let's digitize a 2D array. numpy.digitize requires a 1D array:

import numpy as np
N = 200
A = np.random.random((N, N))
X = np.linspace(0, 1, 20)
print(np.digitize(A.ravel(), X).reshape((N, N)))

Now the documentation says:

... A copy is made only if needed.

How do I know if the ravel copy is "needed" in this case? In general, is there a way I can determine whether a particular operation creates a copy or a view?

Peter Mortensen
Hooked
  • If you want to force a copy, the best thing I found is to use np.copy, or np.array, like `tr = np.array(a.T, copy=True)` – Íhor Mé Sep 20 '16 at 01:10

2 Answers

This question is very similar to a question that I asked a while back:

You can check the base attribute.

import numpy as np
a = np.arange(50)
b = a.reshape((5, 10))
print(b.base is a)  # True: b is a view whose memory is owned by a

However, that's not perfect. You can also check to see if they share memory using np.may_share_memory.

print(np.may_share_memory(a, b))  # True
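
As a rough sketch of how these checks behave (not part of the original answer): the snippet below applies them to the array from the question and shows two caveats. It assumes a NumPy version that provides np.shares_memory, an exact but potentially slower counterpart to np.may_share_memory. The names flat, row, evens and odds are illustrative only.

```python
import numpy as np

# The array from the question: freshly allocated, hence C-contiguous.
A = np.random.random((200, 200))
flat = A.ravel()
print(np.shares_memory(A, flat))           # True: ravel returned a view here

# base refers to the array that owns the memory, not to the
# intermediate view the result was derived from.
a = np.arange(50)
b = a.reshape((5, 10))
row = b[0]                                 # a view of a view
print(row.base is b)                       # False
print(row.base is a)                       # True

# may_share_memory only compares memory bounds, so interleaved
# slices are reported as possibly overlapping.
evens, odds = a[::2], a[1::2]
print(np.may_share_memory(evens, odds))    # True (conservative answer)
print(np.shares_memory(evens, odds))       # False (exact answer)
```

So for the question's example, ravel does not need a copy, because A is contiguous in memory.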

There's also the flags attribute that you can check:

print(b.flags['OWNDATA'])  # False: b does not own its data, so it is a view
e = np.ravel(b[:, 2])
print(e.flags['OWNDATA'])  # True: ravel had to copy the non-contiguous column

But this last one seems a little fishy to me, although I can't quite put my finger on why...
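
One case where OWNDATA can mislead, as a minimal sketch rather than a full explanation: when reshape has to copy, the array it returns may itself be a view onto that hidden copy, so OWNDATA is False even though nothing is shared with the original. The arrays a and c here are illustrative and separate from the example above.

```python
import numpy as np

a = np.zeros((10, 2))
c = a.T.reshape(-1)   # a.T is not C-contiguous, so reshape has to copy

print(c.flags['OWNDATA'])           # False, and yet ...
print(np.shares_memory(a, c))       # False: c does not alias a
print(c.base is a)                  # False: c.base is the hidden copy
print(c.base.flags['OWNDATA'])      # True: the copy owns the data

c[0] = 1.0
print(a[0, 0])                      # 0.0: writing to c leaves a untouched
```

In other words, OWNDATA says whether this particular array object owns its buffer, not whether the data was copied from the array you started with.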

mgilson
  • Interesting, thanks for the links to your answers. I'll leave my question up as the wording "view" versus "share" is different (and didn't come up in a search). – Hooked Jul 17 '12 at 14:37
  • @Hooked -- Yeah, this is just barely different enough for me to answer your question instead of mark it as a duplicate (I don't know what others will think). Anyway, hopefully this is helpful. – mgilson Jul 17 '12 at 14:40
  • I was just trying some of these out and using `flags['OWNDATA']` can definitely fail in some cases. In your example, if you use `e = np.reshape(b[:, 2], -1)` instead of `ravel`, `flags['OWNDATA']` will be False, even though a copy was made. – amicitas Jan 11 '13 at 03:48
  • @amicitas, that's because `e` is in fact a view on `e.base` which itself is the actual copy of the array produced by the `reshape` operation. See [further here](http://stackoverflow.com/q/28886731/2476444). – Oliver W. Mar 05 '15 at 23:22
  • @mgilson In which way is the first solution not perfect? – Malin Oct 30 '20 at 08:28

In the documentation for reshape there is some information about how to ensure an exception if a view cannot be made:

It is not always possible to change the shape of an array without copying the data. If you want an error to be raised if the data is copied, you should assign the new shape to the shape attribute of the array:

>>> a = np.zeros((10, 2))
# A transpose makes the array non-contiguous
>>> b = a.T
# Taking a view makes it possible to modify the shape without modifying the
# initial object.
>>> c = b.view()
>>> c.shape = (20)
AttributeError: incompatible shape for a non-contiguous array



This is not exactly an answer to your question, but in certain cases it may be just as useful.

amicitas
  • I'm still confused as to why this works. Why should I be able to apply a shape to a non-view? – Seanny123 Apr 06 '18 at 17:26
  • This is really telling you if the data is contiguous, not if it is a view. While it's true that non-contiguous data can only be a view, the converse is not true. For example, `a = np.zeros((10, 4)); b = a[0]; b.shape = (2, 2)` works just fine, because `b` is a (C) contiguous view of `a` (which you can already see from `b.flags['C_CONTIGUOUS']`). Also it should be noted that `c = b.view()` does nothing in this example. `a.T` is already a view of `a`. – asmeurer Jun 15 '21 at 22:33