There are a number of SO questions about object dtype arrays, and even some about getting or testing the attributes of the elements of such arrays. The general point is that such an array behaves much like a list. For the most part you have to iterate over the array as though it were a list. Most of the cool things that you can do with numeric arrays, like `add`, `multiply`, etc., do not apply, or your objects have to implement specific methods for the actions to propagate down to the objects.
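As a rough illustration of that last point, here is a minimal sketch using a hypothetical `Money` class (my own invention, not something from the question). It defines `__add__` but not `__mul__`, so `+` on an object array propagates down to the elements while `*` fails.

```python
import numpy as np

class Money:
    """Hypothetical class, used only for this illustration."""
    def __init__(self, amount):
        self.amount = amount
    def __add__(self, other):
        return Money(self.amount + other.amount)
    def __repr__(self):
        return 'Money(%s)' % self.amount

arr = np.empty(2, dtype=object)
arr[:] = [Money(1), Money(2)]

# np.add on an object array calls Money.__add__ on each pair of elements
total = arr + arr

# Money has no __mul__, so the elementwise multiply raises TypeError
multiply_failed = False
try:
    arr * 2
except TypeError:
    multiply_failed = True
```

The result `total` is still an object array; numpy just delegates each operation to the Python objects.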
In [15]: class Test:
    ...:     def __init__(self, name):
    ...:         self.name = name
    ...:     def __repr__(self):  # added for nice display
    ...:         return 'Test:%s' % self.name
    ...:
In [16]: A = np.empty((2,2),dtype=object)
I can assign all the values at once using `flat[:]` and a list:
In [17]: A.flat[:]=[Test('A'),Test('B'),Test('C'),Test('D')]
In [18]: A # try this display without the `repr`
Out[18]:
array([[Test:A, Test:B],
       [Test:C, Test:D]], dtype=object)
The next test returns False because I did not define an `__eq__` method (`__cmp__` in Python 2) for the class; that is, `Test('A') == Test('A')` is also False.
In [19]: Test('A') in A
Out[19]: False
In [20]: A[0,1].name
Out[20]: 'B'
This is True because, with no `__eq__` defined, the comparison falls back to an identity test, and `A[0,1]` is the very object stored in `A`:
In [21]: A[0,1] in A
Out[21]: True
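If I want the `in` test to match on the name rather than on identity, the class needs its own `__eq__`. A minimal sketch, assuming a hypothetical `NamedTest` variant of the class above (the name and the `isinstance` check are my choices):

```python
import numpy as np

class NamedTest:
    """Hypothetical variant of Test that compares by name."""
    def __init__(self, name):
        self.name = name
    def __eq__(self, other):
        # two instances are equal when their names match
        return isinstance(other, NamedTest) and self.name == other.name
    def __repr__(self):
        return 'NamedTest:%s' % self.name

B = np.empty((2, 2), dtype=object)
B.flat[:] = [NamedTest(n) for n in 'ABCD']

# a fresh object with the same name now counts as "in" the array
found = NamedTest('A') in B
missing = NamedTest('Z') in B
```

Here `found` is True even though `NamedTest('A')` is a new object, because the array's `in` test compares elements with `==`.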
Since `A` is 2d, a simple list comprehension on it does not work, at least not for testing attributes. `a` in this case is a row of `A`, a 1d object array:
In [23]: [a.name for a in A]
...
AttributeError: 'numpy.ndarray' object has no attribute 'name'
To get the names I have to iterate over `A.flat`; I can then apply the `in` test to the resulting list:
In [24]: [a.name for a in A.flat]
Out[24]: ['A', 'B', 'C', 'D']
In [25]: 'B' in [a.name for a in A.flat]
Out[25]: True
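The same flat iteration can be wrapped in a small helper. This is only a sketch; `has_name` is a hypothetical function name, and using `any` with a generator is my choice so that the loop stops at the first match instead of building the whole list.

```python
import numpy as np

class Test:
    def __init__(self, name):
        self.name = name

A = np.empty((2, 2), dtype=object)
A.flat[:] = [Test('A'), Test('B'), Test('C'), Test('D')]

def has_name(arr, name):
    """Hypothetical helper: True if any element of arr has this .name."""
    # arr.flat visits every element regardless of the array's shape
    return any(obj.name == name for obj in arr.flat)

# ravel() is another way to get a 1d sequence to iterate over
names = [obj.name for obj in A.ravel()]
```

`has_name(A, 'B')` and `'B' in names` answer the same question; the helper just avoids the intermediate list.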
`np.vectorize` is a way of writing functions that operate on arrays of various shapes. It uses `np.frompyfunc`, which in this case works just as well, if not better.
In [27]: foo = np.frompyfunc(lambda a: a.name, 1,1)
In [28]: foo(A)
Out[28]:
array([['A', 'B'],
       ['C', 'D']], dtype=object)
In [29]: 'C' in foo(A)
Out[29]: True
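For comparison, here is roughly what the `np.vectorize` version might look like. The `otypes=[str]` argument is my choice; it asks for a string dtype result instead of the object dtype that `frompyfunc` returns, and it saves `vectorize` from making a trial call to work out the output type.

```python
import numpy as np

class Test:
    def __init__(self, name):
        self.name = name

A = np.empty((2, 2), dtype=object)
A.flat[:] = [Test('A'), Test('B'), Test('C'), Test('D')]

# otypes tells vectorize what dtype the output should have
get_name = np.vectorize(lambda obj: obj.name, otypes=[str])

names = get_name(A)        # same shape as A, but a string array
has_c = 'C' in names
```

Whether the string dtype is worth it depends on what is done with the result next; for a plain membership test the object array from `frompyfunc` is just as good.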
Or I could define a version that does the `name` equality test. Notice this takes 2 inputs:
In [30]: bar = np.frompyfunc(lambda a,b: b == a.name, 2, 1)
In [32]: bar(A,'C')
Out[32]:
array([[False, False],
       [True, False]], dtype=object)
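Note that the result is an object dtype array, not a boolean one. To use it as a mask it has to be converted first; as far as I know an object array is not accepted as a boolean index. A minimal sketch (the variable names `mask`, `matches`, `rows` and `cols` are mine):

```python
import numpy as np

class Test:
    def __init__(self, name):
        self.name = name

A = np.empty((2, 2), dtype=object)
A.flat[:] = [Test('A'), Test('B'), Test('C'), Test('D')]

bar = np.frompyfunc(lambda a, b: b == a.name, 2, 1)

# frompyfunc returns dtype=object; convert before using it as a mask
mask = bar(A, 'C').astype(bool)

matches = A[mask]               # 1d object array of the matching elements
rows, cols = np.nonzero(mask)   # positions of the matches in A
```

With this `A`, `matches` holds the single `Test('C')` object, and `rows, cols` point at position `[1, 0]`.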
I can even test 2 arrays against each other with broadcasting:
In [37]: bar(A,np.array(['A','D'])[:,None,None])
Out[37]:
array([[[True, False],
        [False, False]],

       [[False, False],
        [False, True]]], dtype=object)
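To make the shapes in that broadcast explicit, here is a short sketch. The reduction over the last two axes at the end is my addition; it collapses each 2x2 plane into a single "is this name anywhere in `A`" answer.

```python
import numpy as np

class Test:
    def __init__(self, name):
        self.name = name

A = np.empty((2, 2), dtype=object)
A.flat[:] = [Test('A'), Test('B'), Test('C'), Test('D')]

bar = np.frompyfunc(lambda a, b: b == a.name, 2, 1)

names = np.array(['A', 'D'])[:, None, None]   # shape (2, 1, 1)
result = bar(A, names)                        # (2, 2) with (2, 1, 1) -> (2, 2, 2)

# np.broadcast reports the combined shape without calling the function
out_shape = np.broadcast(A, names).shape

# one plane per name; reduce each plane to a single found/not-found flag
present = result.astype(bool).any(axis=(1, 2))
```

The first axis of `result` indexes the names being searched for, and the remaining two axes line up with `A`.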
`frompyfunc` iterates as the `[a for a in A.flat]` list comprehension does, but is somewhat faster, and throws all the power of numpy broadcasting at the task.
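The speed difference is easy to check on your own machine. This is only a rough sketch using `timeit`; the array size and repeat count are arbitrary choices of mine, and the actual timings will vary with the numpy version and the cost of the function being called.

```python
import timeit
import numpy as np

class Test:
    def __init__(self, name):
        self.name = name

# a larger array than in the examples above, so the timings are measurable
A = np.empty((100, 100), dtype=object)
A.flat[:] = [Test(str(i)) for i in range(A.size)]

foo = np.frompyfunc(lambda a: a.name, 1, 1)

def with_comprehension():
    return [a.name for a in A.flat]

def with_frompyfunc():
    return foo(A)

t_list = timeit.timeit(with_comprehension, number=20)
t_ufunc = timeit.timeit(with_frompyfunc, number=20)
print('list comprehension: %.4f s' % t_list)
print('frompyfunc:         %.4f s' % t_ufunc)
```

Both functions produce the same names; only the container (flat list versus 2d object array) and the iteration mechanism differ.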