1

I want to check if there is any object with a specific attribute in my numpy array:

import numpy

class Test:
    def __init__(self, name):
        self.name = name

l = numpy.empty( (2,2), dtype=object)
l[0][0] = Test("A")
l[0][1] = Test("B")
l[1][0] = Test("C")
l[1][1] = Test("D")

I know that the following line of code works for a list, but what is the alternative for a numpy array?

print numpy.any(l[:,0].name == "A")
Marios
  • Possible duplicate. See this post: http://stackoverflow.com/questions/7088625/what-is-the-most-efficient-way-to-check-if-a-value-exists-in-a-numpy-array – GoingMyWay Sep 30 '16 at 01:09
  • I believe this is a different situation, since I have an array of objects. [The other post](http://stackoverflow.com/questions/7088625/what-is-the-most-efficient-way-to-check-if-a-value-exists-in-a-numpy-array) asks how to check for the existence of a value in an array. – Setareh Behroozi Sep 30 '16 at 01:16
  • Object arrays are little more than glorified lists. They can have a 2d shape, but they don't have the math powers of a regular array. Why are you using them? – hpaulj Sep 30 '16 at 01:29
  • Actually, I need my data to be exactly in a 2D structure. – Setareh Behroozi Sep 30 '16 at 01:55

3 Answers

3

There are a number of SO questions about object dtype arrays, and even some about getting or testing the attributes of the elements of such arrays. The general point is that such an array behaves much like a list. For the most part you have to iterate over the array as though it were a list. Most of the cool things that you can do with numeric arrays (add, multiply, etc.) do not apply, unless your objects implement the specific methods needed for those actions to propagate down to them.

In [15]: class Test:
    ...:     def __init__(self,name):
    ...:         self.name=name
    ...:     def __repr__(self):   # added for nice display
    ...:         return 'Test:%s'%self.name
    ...:     
In [16]: A = np.empty((2,2),dtype=object)

I can assign all the values at once using flat[:] and a list:

In [17]: A.flat[:]=[Test('A'),Test('B'),Test('C'),Test('D')]

In [18]: A      # try this display without the `repr`
Out[18]: 
array([[Test:A, Test:B],
       [Test:C, Test:D]], dtype=object)

The next test returns False because I did not define an equality method (__eq__, or __cmp__ in Python 2) for the class; that is, Test('A')==Test('A') is also False.

In [19]: Test('A') in A
Out[19]: False


In [20]: A[0,1].name
Out[20]: 'B'

This is True because the element being tested is the very same object that is stored in the array (an identity test):

In [21]: A[0,1] in A
Out[21]: True
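
The two membership tests above differ only because Test has no equality method. As a minimal sketch (the __eq__ and __hash__ methods here are my own additions for illustration, not part of the original class), defining equality on the name lets the in test work with a fresh object:

```python
import numpy as np

class Test:
    def __init__(self, name):
        self.name = name

    # Assumed addition: compare Test objects by their name attribute.
    def __eq__(self, other):
        if not isinstance(other, Test):
            return NotImplemented
        return self.name == other.name

    # Defining __eq__ removes the default hash in Python 3, so restore one.
    def __hash__(self):
        return hash(self.name)

A = np.empty((2, 2), dtype=object)
A.flat[:] = [Test("A"), Test("B"), Test("C"), Test("D")]

# `in` compares the candidate against every element with ==
found = Test("A") in A
missing = Test("K") in A
print(found, missing)
```

This only helps if comparing by name is the right notion of equality for the whole class; if you just need a one-off attribute check, iterating is less intrusive.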

Since A is 2d, a simple list comprehension on it does not work, at least not for testing attributes. Here a is a row of A, that is, a 1d object array:

In [23]: [a.name for a in A]
...
AttributeError: 'numpy.ndarray' object has no attribute 'name'

To get the names I have to iterate over A.flat; I can then apply the in test to the resulting list:

In [24]: [a.name for a in A.flat]
Out[24]: ['A', 'B', 'C', 'D']
In [25]: 'B' in [a.name for a in A.flat]
Out[25]: True
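
Put together as a standalone script, the same idea can be written with any() and a generator, which stops at the first match. This is a sketch; the variable names are illustrative. The last two lines apply it to the first column only, as in the question:

```python
import numpy as np

class Test:
    def __init__(self, name):
        self.name = name

A = np.empty((2, 2), dtype=object)
A.flat[:] = [Test("A"), Test("B"), Test("C"), Test("D")]

# Whole array: A.flat yields the elements one by one, whatever the shape.
has_b = any(obj.name == "B" for obj in A.flat)

# First column only: A[:, 0] is a 1d object array, so plain iteration works.
col0_has_a = any(obj.name == "A" for obj in A[:, 0])
col0_has_b = any(obj.name == "B" for obj in A[:, 0])

print(has_b, col0_has_a, col0_has_b)
```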

np.vectorize is a way of writing functions that operate on arrays of various shapes. It uses np.frompyfunc, which in this case works just as well, if not better.

In [27]: foo = np.frompyfunc(lambda a: a.name, 1,1)
In [28]: foo(A)
Out[28]: 
array([['A', 'B'],
       ['C', 'D']], dtype=object)
In [29]: 'C' in foo(A)
Out[29]: True

Or I could define a version that does the name equality test. Notice that this takes 2 inputs:

In [30]: bar = np.frompyfunc(lambda a,b: b == a.name, 2, 1)

In [32]: bar(A,'C')
Out[32]: 
array([[False, False],
       [True, False]], dtype=object)

I can even test 2 arrays against each other with broadcasting:

In [37]: bar(A,np.array(['A','D'])[:,None,None])
Out[37]: 
array([[[True, False],
        [False, False]],

       [[False, False],
        [False, True]]], dtype=object)
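
Note that frompyfunc returns object dtype arrays, so to answer the original "is there any?" question the result is best converted to bool first. A sketch of that step (name_equals and the list of candidate names are my own choices for illustration):

```python
import numpy as np

class Test:
    def __init__(self, name):
        self.name = name

A = np.empty((2, 2), dtype=object)
A.flat[:] = [Test("A"), Test("B"), Test("C"), Test("D")]

# Same idea as bar above: 2 inputs (object, name), 1 output.
name_equals = np.frompyfunc(lambda obj, name: obj.name == name, 2, 1)

# Convert the object-dtype result to a real boolean mask.
mask = name_equals(A, "C").astype(bool)
any_c = mask.any()               # is there any element named "C"?
where_c = np.argwhere(mask)      # row/column indices of the matches

# Broadcasting: test several names at once, then reduce over the array axes.
names = np.array(["A", "D", "K"])
hits = name_equals(A, names[:, None, None]).astype(bool)   # shape (3, 2, 2)
found = hits.any(axis=(1, 2))    # one flag per candidate name

print(any_c, where_c.tolist(), found.tolist())
```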

frompyfunc iterates as the [a for a in A.flat] does, but is somewhat faster, and throws all the power of numpy broadcasting at the task.

hpaulj
1

One simple way would be to create your own array class by subclassing NumPy's ndarray, and then add a custom method that checks for the existence of an object based on its name attribute:

In [71]: class Myarray(np.ndarray):
   ....:         def __new__(cls, inputarr):
   ....:                 obj = np.asarray(inputarr).view(cls)
   ....:                 return obj
   ....:         def custom_contain(self, name):
   ....:                 return any(obj.name == name for obj in self.flat)

Demo:

In [4]: A = np.empty((2,2),dtype=object)    
In [8]: A.flat[:] = [Test("A"), Test("B"), Test("C"), Test("D")]

In [9]: A
Out[9]: 
array([[<__main__.Test instance at 0x7fae0a14ddd0>,
        <__main__.Test instance at 0x7fae0a14de18>],
       [<__main__.Test instance at 0x7fae0a14de60>,
        <__main__.Test instance at 0x7fae0a14dea8>]], dtype=object)

In [11]: A = Myarray(A)

In [12]: A.custom_contain('C')
Out[12]: True

In [13]: A.custom_contain('K')
Out[13]: False
Mazdak
  • Thank you for your informative comment, but you are still using a list as your array, which works fine with any(). – Setareh Behroozi Sep 30 '16 at 01:34
  • How about changing the code to this: `l = np.empty( (2,2), dtype=object); l = Myarray(l); print l; l[0][0] = Test("A"); l[0][1] = Test("B"); l[1][0] = Test("C"); l[1][1] = Test("D"); print l.custom_contain('C')` – Setareh Behroozi Sep 30 '16 at 01:35
  • The answer iterates over the array, just as though it were a list. That is typical of object array operations. – hpaulj Sep 30 '16 at 01:39
  • @hpaulj So you mean the custom_contain function still works if you pass an array to the Myarray constructor? Unfortunately, it does not. – Setareh Behroozi Sep 30 '16 at 01:42
  • I mean `custom_contain` has a `for obj in self` phrase. If `self` is a 2d array it may need `for obj in self.flat`. – hpaulj Sep 30 '16 at 02:09
  • @SetarehBehroozi Yes, I missed that, but as hpaulj described, you can use `self.flat` within the generator expression in order to loop over the flattened array. – Mazdak Sep 30 '16 at 10:34
0

I'm clearly not proficient in numpy but couldn't you just do something like:

numpy.any([ x.name=='A' for x in l[:,0] ])

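edit: (Google tells me that) it's possible to iterate over arrays with nditer; is this what you want? For an object array, nditer needs the refs_ok flag, and each item it yields is a 0-d array, so .item() is needed to get at the object:

numpy.any([ x.item().name=='A' for x in numpy.nditer(l, flags=['refs_ok']) ])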
n.caillou