Numpy Array Mask Bytes

Question

I have been trying to get simple numpy masking working without resorting to non-numpy functions, but I have run into what seems like a bug.

import numpy as np
origlist = np.array([[b'\x00'] * 100] * 128, dtype=object)  # np.object is deprecated
origlist[0][0] = b'\x00\x00'
newlist = origlist[0][origlist[0] != b'\x00\x00']

This gives newlist as [b'\x00\x00', b'\x00', b'\x00', ...], whereas I expect [b'\x00', b'\x00', ...].

Similarly,

import numpy as np
origlist = np.array([[b'\x00'] * 100] * 128, dtype=object)  # np.object is deprecated
origlist[0][0] = b'\x00\x00'
newlist = origlist[origlist != b'\x00']

This also gives newlist as [b'\x00\x00', b'\x00', b'\x00', ...], whereas I expect [b'\x00\x00'].

UPDATE

Comparing numpy array of dtype object

I have tried basically everything mentioned in the post above, and nothing helped: every comparison evaluated to True. I also tried initializing the array with np.full instead, with the same result.
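One possible explanation for comparisons that always evaluate to True, offered as an assumption rather than a confirmed diagnosis of the code above: numpy's fixed-width bytes dtype (S) drops trailing NUL bytes when values are read back. If the bytes scalar on the right-hand side of != is coerced to that dtype during the comparison, it would be treated as b'' and so differ from every element. The sketch below only demonstrates the stripping itself; the array names are illustrative.

```python
import numpy as np

values = [b'\x00', b'\x00\x00', b'ab\x00']

# Fixed-width bytes dtype: trailing NUL bytes are treated as padding
fixed_width = np.array(values)
print(fixed_width.dtype)      # a fixed-width bytes dtype (S3)
print(fixed_width.tolist())   # [b'', b'', b'ab']

# Object dtype: the original Python bytes objects are stored unchanged
as_object = np.array(values, dtype=object)
print(as_object.tolist())     # [b'\x00', b'\x00\x00', b'ab\x00']
```

Whether the coercion actually happens in a given comparison may depend on the numpy version, so this is worth checking against the installed release.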

The only way I was able to get it working with numpy version 1.19 was to avoid numpy for the filtering step.

import numpy as np
origlist = np.full((128, 100), b'\x00', dtype=object)  # np.object is deprecated
origlist[0][0] = b'\x00\x00'
newlist = origlist[0].tolist()
newlist = list(filter((b'\x00').__ne__, newlist))

This now returns the correct [b'\x00\x00'].

If there is any other numpy way to do this, or any numpy way to fix this, please let me know.
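As a minimal sketch of one way to stay inside numpy, the mask can be built by applying Python's own != to each element with np.frompyfunc, so the bytes scalar is never converted to a numpy bytes dtype. This is a workaround under that assumption, not a fix for the comparison operator itself; the names row, not_single_nul, mask, and filtered are illustrative, and the row is shortened to 5 elements.

```python
import numpy as np

# Build the object array from a list so each element stays a Python bytes object
row = np.array([b'\x00'] * 5, dtype=object)
row[0] = b'\x00\x00'

# frompyfunc wraps a Python callable as a ufunc that runs once per element
# and returns an object array; here each result is a Python bool
not_single_nul = np.frompyfunc(lambda item: item != b'\x00', 1, 1)

# Convert the object array of bools into a real boolean mask
mask = not_single_nul(row).astype(bool)

filtered = row[mask]
print(filtered.tolist())  # [b'\x00\x00']
```

The same wrapped function can be applied to a 2-D object array; indexing with the resulting 2-D mask returns a flattened 1-D array of the matching elements. Because the comparison runs in Python for every element, this is no faster than a list comprehension.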

`[....]*4` makes 4 references to the same object, not 4 separate copies. This construction is dangerous in lists, and in object dtype arrays. — hpaulj, Apr 28 '21 at 04:01
@hpaulj, this is useful information to know, and the replacement is np.full, but this didn't fix the issue. — user8079, Apr 29 '21 at 00:11
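The exchange in the comments can be illustrated with a short sketch (variable names are illustrative). In a nested list, the outer * repeats references to one inner list, so a write through one row shows up in all of them. Passing that nested list to np.array with dtype=object produces an array with its own cell for every position, so assigning to one cell leaves the others alone, which is consistent with the observation that switching to np.full did not change the result.

```python
import numpy as np

# Nested list: both rows are the same inner list object
rows = [[0] * 3] * 2
rows[0][0] = 1
print(rows)                 # [[1, 0, 0], [1, 0, 0]]
print(rows[0] is rows[1])   # True

# Object array built from the same kind of nested list:
# each cell holds its own reference, so only one cell changes
arr = np.array([[b'\x00'] * 3] * 2, dtype=object)
arr[0][0] = b'\x00\x00'
print(arr.tolist())
# [[b'\x00\x00', b'\x00', b'\x00'], [b'\x00', b'\x00', b'\x00']]
```

Shared references matter most when the repeated object is mutable (a list or dict, for example); bytes objects are immutable, so sharing them is harmless here.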
