OWNDATA flag unreliable in both directions for numpy arrays?

Question

After reading the questions here and here, it seems that the OWNDATA flag isn't always reliable for numpy arrays in determining whether an object is a copy of another one or not.

The answers to these questions seem to say however that OWNDATA sometimes produces 'false negatives' (edit: on second thought, that might be more of a 'false positive') i.e. answers 'false' when in fact an object is a copy, and can be safely changed without changing the original.

Now I'm wondering: is the following a case of the flag also sometimes yielding false positives, i.e. claiming something is a copy, when it really isn't? (Question, part 1) Alternatively, I misunderstand what the OWNDATA flag is intended to tell me...

import numpy as np

a = np.arange(3)
b = a
print(b.flags['OWNDATA']) # True

b = b+1
print(a==b) # False => b is copy, matching `OWNDATA` flag

b = a
print(b.flags['OWNDATA']) # True
b += 1
print(a==b) # True => b not a copy, mismatch with `OWNDATA`?

(Question, part 2) Finally: if neither OWNDATA nor a.base is b is a reliable indicator of whether an object is a copy, then what is the right way to determine it? The questions linked above mention may_share_memory, but that one seems to be overeager in the other direction, answering 'True' on anything that is not constructed or created as an explicit np.copy of another object.

Why do you need to know this? Something that you will test for in your code? Or are you trying to learn which operations copy and which share? — hpaulj, Oct 30 '16 at 12:01
@hpaulj A combination, I guess. I'd like to know if there is a general method to determine it, and, practically, I'd prefer not having to either (a) copy every time I'm unsure, (b) if I don't copy, test extensively, or (c) memorize which np operations copy or not under which exact conditions. — Bert Zangle, Oct 30 '16 at 12:04
@hpaulj Yes, I read it, but I'm not sure I understood if it also answers my question above (part 1). I think I get how `OWNDATA` can answer `False` in cases where it seems counterintuitive, but I don't understand how there can be a case of `OWNDATA` being `True` (like in my example) but the original array data changing nonetheless. — Bert Zangle, Oct 30 '16 at 18:37
I think you are confusing shared object references with shared data buffers. See my answer. — hpaulj, Oct 30 '16 at 19:48

score 1 · Accepted Answer · answered Oct 30 '16 at 19:47

Looking more at your example:

a = np.arange(3)
b = a
print(b.flags['OWNDATA']) # True

b is a reference to a; they are the same Python object

b = b+1
print(a==b) # False => b is copy, matching `OWNDATA` flag

b is now a new array, produced by the addition operation. It no longer points to the original array. You could just as well have looked at c = b+1 and tested c. The new array does not share a data buffer with the original b.
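As a minimal sketch of that point (the name c is only for illustration, and np.shares_memory is used here as one way to check for a shared buffer), the result of the addition can be inspected without rebinding b:

```python
import numpy as np

a = np.arange(3)
b = a          # b and a are two names for one array object
c = b + 1      # the addition allocates a new array with its own buffer

print(c is a)                   # False: a different Python object
print(c.flags['OWNDATA'])       # True: c owns its buffer
print(np.shares_memory(a, c))   # False: no bytes in common with a
print(a)                        # [0 1 2], the original is untouched
```

Writing to c afterwards cannot affect a, which is what "copy" means in the question.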

b = a
print(b.flags['OWNDATA']) # True
b += 1
print(a==b) # True => b not a copy, mismatch with `OWNDATA`?

b has been modified in-place; it's the same array object that it was before, but with new values. Since a references the same object, it also 'appears' to be changed. b is a, and it still has its own data buffer. You might also look at id(b) and id(a).
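A short check along those lines, assuming nothing beyond plain NumPy and Python's built-in id and is:

```python
import numpy as np

a = np.arange(3)
b = a      # plain Python name binding: no new array is created
b += 1     # in-place add: writes into the one existing buffer

print(b is a)               # True: one object, two names
print(id(a) == id(b))       # True
print(a)                    # [1 2 3], because a and b are the same array
print(b.flags['OWNDATA'])   # True: that single array owns its buffer
```

So OWNDATA being True is not a mismatch here: there is only one array, and it owns its data.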

OWNDATA is meaningful only when comparing one array with a view or with the result of some other operation: a different array object which may or may not share a data buffer with the original.
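To illustrate the case where the flag does carry information, here is a rough sketch comparing a slice (a view) with an explicit copy; the names v and c are made up for the example:

```python
import numpy as np

a = np.arange(6)
v = a[::2]          # basic slicing returns a view onto a's buffer
c = a[::2].copy()   # explicit copy with a buffer of its own

print(v.flags['OWNDATA'], v.base is a)      # False True
print(c.flags['OWNDATA'], c.base is None)   # True True
print(np.shares_memory(a, v))               # True
print(np.shares_memory(a, c))               # False

v[0] = 99           # writing through the view changes a
print(a[0])         # 99
```

Here v and c are distinct array objects from a, so asking whether they share its buffer is a real question. np.shares_memory answers it exactly, while np.may_share_memory only checks memory bounds by default and can therefore report True for arrays that do not actually overlap.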

You may need to dig more into what b=a does in Python (it's not a numpy issue), and how arrays are constructed with data buffers.

"You may need to dig more into what b=a does in Python (it's not a numpy issue), and how arrays are constructed with data buffers." When writing my question I had the suspicion already I'm missing something more basic here... thanks a lot for taking the time to clear up my misunderstanding. Marking as answered. — Bert Zangle, Oct 30 '16 at 22:01
