
I have been trying to get simple numpy masking working without resorting to non-numpy functions, but I have run into what seems like a bug.

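import numpy as np
# dtype=object replaces np.object, which newer numpy releases no longer provide
origlist = np.array([[b'\x00'] * 100] * 128, dtype=object)
origlist[0][0] = b'\x00\x00'
# intended: drop the two-byte entry from the first row
newlist = origlist[0][origlist[0] != b'\x00\x00']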

This yields newlist as [b'\x00\x00', b'\x00', b'\x00', ...], whereas I expect [b'\x00', b'\x00', ...].

Similarly,

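import numpy as np
# dtype=object replaces np.object, which newer numpy releases no longer provide
origlist = np.array([[b'\x00'] * 100] * 128, dtype=object)
origlist[0][0] = b'\x00\x00'
# intended: keep only the entries that differ from the single null byte
newlist = origlist[origlist != b'\x00']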

This also yields newlist as [b'\x00\x00', b'\x00', b'\x00', ...], whereas I expect [b'\x00\x00'].

UPDATE

Comparing numpy array of dtype object

I have tried essentially everything mentioned in the post above, and nothing helped: every comparison evaluated to True. I also tried replacing the initialization with np.full, with the same result.

The only way I was able to get it working with numpy version 1.19 was to not use numpy for the comparison.

import numpy as np
origlist = np.full((128, 100), b'\x00', dtype=object)  # np.object is deprecated
origlist[0][0] = b'\x00\x00'
newlist = origlist[0].tolist()
newlist = list( filter( (b'\x00').__ne__, newlist))

This now returns the correct [b'\x00\x00'].

If there is any other numpy way to do this, or any numpy way to fix this, please let me know.
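One possible numpy-only approach, offered as a sketch rather than a confirmed fix: the symptoms are consistent with the bare bytes value being converted to a fixed-width bytes dtype before the comparison, and that dtype drops trailing null bytes. Assuming that is the cause, wrapping the comparison value in a 0-d object array (called `needle` here, a name chosen only for illustration) should keep the comparison at the Python level. A list-comprehension mask is included as a fallback that avoids numpy's scalar conversion altogether.

```python
import numpy as np

# Same data as in the question, built from nested lists.
origlist = np.array([[b'\x00'] * 100] * 128, dtype=object)
origlist[0][0] = b'\x00\x00'

# Assumption: a 0-d object array holds the bytes value unchanged, so it is
# never turned into a fixed-width bytes scalar before comparing.
needle = np.array(b'\x00', dtype=object)

# Element-wise comparison between two object arrays gives a boolean mask.
row_mask = origlist[0] != needle
newlist = origlist[0][row_mask]

# The same idea applied to the whole 2-D array (the result is 1-D).
all_different = origlist[origlist != needle]

# Fallback: build the mask in plain Python, then index with numpy.
py_mask = np.array([item != b'\x00' for item in origlist[0]], dtype=bool)
newlist_py = origlist[0][py_mask]
```

If the assumption holds, `newlist`, `all_different`, and `newlist_py` should each contain only the two-byte entry.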

user8079
  • `[....]*4` makes 4 references to the same object, not 4 separate copies. This construction is dangerous in lists, and in object dtype arrays. – hpaulj Apr 28 '21 at 04:01
  • @hpaulj, this is useful information to know, and the replacement is np.full, but this didn't fix the issue. – user8079 Apr 29 '21 at 00:11
  • `full` doesn't make copies either. – hpaulj Apr 29 '21 at 01:00
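For reference, a small sketch of the aliasing described in the comments. It assumes nothing beyond plain lists and `np.array`, and the variable names are illustrative. It also suggests why shared references probably do not explain the masking result here: the bytes values are immutable, and each position in the array is its own slot.

```python
import numpy as np

# List multiplication repeats references to the same inner list.
rows = [[0] * 3] * 2
rows[0][0] = 1                  # the change is visible through rows[1] too
shared = rows[0] is rows[1]

# np.array copies the nested structure into its own 2-D block of slots,
# so assigning to one slot leaves the other rows untouched.
arr = np.array([[b'\x00'] * 3] * 2, dtype=object)
arr[0][0] = b'\x00\x00'
other_row_first = arr[1][0]
```

Here `shared` is true for the nested list, while `other_row_first` is still the original single null byte.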

0 Answers