After reading the questions here and here, it seems that the OWNDATA flag isn't always reliable for numpy arrays in determining whether an object is a copy of another one or not.

The answers to these questions seem to say, however, that OWNDATA sometimes produces 'false negatives' (edit: on second thought, that might be more of a 'false positive'), i.e. it answers 'false' when an object is in fact a copy and can be safely changed without changing the original.

Now I'm wondering: is the following a case of the flag also sometimes yielding false positives, i.e. claiming something is a copy, when it really isn't? (Question, part 1) Alternatively, I misunderstand what the OWNDATA flag is intended to tell me...

import numpy as np

a = np.arange(3)
b = a
print(b.flags['OWNDATA']) # True

b = b+1
print(a==b) # False => b is copy, matching `OWNDATA` flag

b = a
print(b.flags['OWNDATA']) # True
b += 1
print(a==b) # True => b not a copy, mismatch with `OWNDATA`?

(Question, part 2) Finally: if neither OWNDATA nor a.base is b is a reliable indicator of whether an object is a copy, then what is the right way to determine it? The questions linked above mention may_share_memory, but that one seems to be overeager in the other direction, answering 'True' on anything that is not constructed or created as an explicit np.copy of another object.

Bert Zangle
  • Why do you need to know this? Something that you will test for in your code? Or are you trying to learn which operations copy and which share? – hpaulj Oct 30 '16 at 12:01
  • @hpaulj A combination, I guess. I'd like to know if there is a general method to determine it, and, practically, I'd prefer not having to either (a) copy every time I'm unsure, (b) if I don't copy, test extensively, or (c) memorize which np operations copy or not under which exact conditions. – Bert Zangle Oct 30 '16 at 12:04
  • Looks like you already found my answer in the 2nd link. – hpaulj Oct 30 '16 at 12:14
  • @hpaulj Yes, I read it, but I'm not sure I understood if it also answers my question above (part 1). I think I get how `OWNDATA` can answer `False` in cases where it seems counterintuitive, but I don't understand how there can be a case of `OWNDATA` being `True` (like in my example) but the original array data changing nonetheless. – Bert Zangle Oct 30 '16 at 18:37
  • I think you are confusing shared object references with shared data buffers. See my answer. – hpaulj Oct 30 '16 at 19:48

1 Answer

Looking more at your example:

a = np.arange(3)
b = a
print(b.flags['OWNDATA']) # True

b is just another name for a; both names refer to the same Python object, so no new array (and no copy) exists at this point.

b = b+1
print(a==b) # False => b is copy, matching `OWNDATA` flag

b is now a new array, produced by the addition. The name b no longer points to the original array. You could just as well have written c = b+1 and tested c. This new array does not share a data buffer with the original.
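A minimal sketch of that point, using the c = b+1 variant so that b keeps pointing at the original. The variable names same_object and shares_buffer are only illustrative; np.shares_memory is the exact (non-heuristic) counterpart of may_share_memory in reasonably recent NumPy versions.

```python
import numpy as np

a = np.arange(3)
b = a          # b and a are two names for one array object
c = b + 1      # the addition allocates a brand-new array

same_object = c is b                     # identity of the Python objects
shares_buffer = np.shares_memory(b, c)   # overlap of the underlying data

print(same_object)    # False
print(shares_buffer)  # False
print(a)              # [0 1 2], the original is untouched
```

Here c owns a fresh buffer, so writing to c cannot affect a or b.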

b = a
print(b.flags['OWNDATA']) # True
b += 1
print(a==b) # True => b not a copy, mismatch with `OWNDATA`?

b has been modified in place; it is the same array object that it was before, but with new values. Since a references the same object, it also 'appears' to be changed. b is a, and that single array still owns its data buffer, so OWNDATA is correctly True. You might also compare id(b) and id(a).
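The identity check suggested here can be sketched as follows. Nothing in it is specific to NumPy except the in-place += on the array; the names are the ones from the question.

```python
import numpy as np

a = np.arange(3)
b = a        # plain Python name binding, no array is created
b += 1       # in-place update of the one existing array

print(b is a)              # True: one object, two names
print(id(a) == id(b))      # True
print(a)                   # [1 2 3], "a" changed because it *is* b
print(a.flags['OWNDATA'])  # True: that one array owns its buffer
```

OWNDATA is True here because there is only one array, and it owns its data. The flag says nothing about how many names refer to that array.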

OWNDATA is meaningful only when comparing one array with a view or with the result of some other operation: a different array object which may or may not share a data buffer with the original.
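A hedged illustration of where the flag does carry information, and where it can mislead. The names v, c and r are made up for this sketch. It assumes basic slicing returns a view and that reshape of a contiguous array returns a view, which is the usual NumPy behaviour.

```python
import numpy as np

a = np.arange(6)

v = a[2:]                    # basic slice: a view onto a's buffer
c = a.copy()                 # explicit copy: its own buffer
r = a.copy().reshape(2, 3)   # view of a temporary copy

print(v.flags['OWNDATA'], v.base is a)     # False True
print(c.flags['OWNDATA'], c.base is None)  # True True
print(r.flags['OWNDATA'])                  # False, yet independent of a

# Checking actual memory overlap answers the "is it safe to modify?" question.
print(np.shares_memory(a, v))  # True
print(np.shares_memory(a, c))  # False
print(np.shares_memory(a, r))  # False

v[0] = 99
print(a[2])                    # 99, writing through the view changes a
```

The r case is the counterintuitive one: OWNDATA is False because r borrows its buffer from the temporary copy, not from a. If the question is "will writing to X change Y?", comparing the two arrays with np.shares_memory is likely the more direct test.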

You may need to dig more into what b=a does in Python (it's not a numpy issue), and how arrays are constructed with data buffers.

hpaulj
  • "You may need to dig more into what b=a does in Python (it's not a numpy issue), and how arrays are constructed with data buffers." When writing my question I had the suspicion already I'm missing something more basic here... thanks a lot for taking the time to clear up my misunderstanding. Marking as answered. – Bert Zangle Oct 30 '16 at 22:01