1

I have a numpy array as:

groups = np.array([('Species1',), ('Species2', 'Species3')], dtype=object)

When I call np.where(groups == ('Species2', 'Species3')), or even np.where(groups == groups[1]), I get an empty result: (array([], dtype=int64),)

Why is this, and how can I get the index of such an element?

user3329732

4 Answers

1

Yes, you can search for it, not with np.where but with the help of a for loop and if-else:

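# Compare each element as a whole Python tuple and report its index.
for index, var in enumerate(groups):
    if var == ('Species2', 'Species3'):
        print("('Species2', 'Species3') -->>", index)
    else:
        # Print the element actually found instead of a hard-coded label.
        print(f"{var} -->>", index)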

Output

('Species1',) -->> 0
('Species2', 'Species3') -->> 1
Rahul charan
1

The problem here is probably the way array.__contains__() is implemented. See here. Basically the issue is that

print(('Species2', 'Species3') in groups)

prints False. If you nonetheless want to use numpy.where rather than a for loop, as the other answer suggests, it is probably best to construct a suitable boolean mask first. For example:

mask = np.array([g == ('Species2', 'Species3') for g in groups])
print(np.where(mask))

gives the correct result. There might be a more elegant way though.
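
One possible variant of the mask idea above, as a minimal sketch rather than a definitive solution: it wraps the element-by-element comparison in a small helper. The name find_tuple is hypothetical and not part of NumPy; np.fromiter and np.flatnonzero are used only to build the boolean mask and read off its indices.

```python
import numpy as np

groups = np.array([('Species1',), ('Species2', 'Species3')], dtype=object)

def find_tuple(arr, target):
    """Return indices where an element of arr equals target (hypothetical helper)."""
    # Compare element by element in Python so NumPy never broadcasts the tuple.
    mask = np.fromiter((item == target for item in arr), dtype=bool, count=len(arr))
    return np.flatnonzero(mask)

matches = find_tuple(groups, ('Species2', 'Species3'))
print(matches)
```

This is still a Python-level loop, so it should not be expected to be faster than the list-based mask; it only keeps the lookup in one place.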

Banana
1

When you write

np.where(groups == ('Species2', 'Species3'))

NumPy does not search groups for the tuple ('Species2', 'Species3') as a whole. It converts the tuple to an array and compares 'Species2' and 'Species3' separately, element by element. This is easier to see with a complete (rectangular) array like this:

groups = np.array([('Species1', ''), ('Species2', 'Species3')], dtype=object)
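
To illustrate the point, here is a small sketch using that rectangular array. The variable names full, mask and rows are only for illustration. The comparison yields one boolean per cell, so a row matches the tuple only if every cell in that row is True.

```python
import numpy as np

# Rectangular input: NumPy builds a (2, 2) object array, not an array of tuples.
full = np.array([('Species1', ''), ('Species2', 'Species3')], dtype=object)

# The tuple is treated as a length-2 array and broadcast across each row.
mask = full == ('Species2', 'Species3')
print(mask)

# A row matches the whole tuple only when all of its cells match.
rows = np.where(mask.all(axis=1))[0]
print(rows)
```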

Jaymin
Mao
0

Your array has two tuples:

In [53]: groups=np.array([('Species1',), ('Species2', 'Species3')], dtype=object)                    
In [54]: groups                                                                                      
Out[54]: array([('Species1',), ('Species2', 'Species3')], dtype=object)
In [55]: groups.shape                                                                                
Out[55]: (2,)

But be careful with that kind of definition. If the tuples were all the same size, the array would have a different shape, and the elements would no longer be tuples.

In [56]: np.array([('Species1',), ('Species2',), ('Species3',)], dtype=object)                       
Out[56]: 
array([['Species1'],
       ['Species2'],
       ['Species3']], dtype=object)
In [57]: _.shape                                                                                     
Out[57]: (3, 1)
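
If a 1-d array of tuples is really wanted even when the tuples have equal length, one approach (a sketch, with the variable names tuples and arr chosen only for illustration) is to allocate an empty object array first and fill it one element at a time, so NumPy never gets the chance to unpack the tuples into a second dimension:

```python
import numpy as np

tuples = [('Species1',), ('Species2',), ('Species3',)]

# Allocate a 1-d object array, then assign each tuple to its own slot.
arr = np.empty(len(tuples), dtype=object)
for i, t in enumerate(tuples):
    arr[i] = t

print(arr.shape)
print(type(arr[0]))
```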

Any use of where is only as good as the boolean array given to it. This where returns empty because the equality test produces all False:

In [58]: np.where(groups == groups[1])                                                               
Out[58]: (array([], dtype=int64),)
In [59]: groups == groups[1]                                                                         
Out[59]: array([False, False])
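
A rough way to see where the two False values come from (a sketch of the effective behaviour, not of NumPy's actual implementation): the right-hand tuple is treated as a 2-element array, so element i of groups is compared with string i of the tuple, and a tuple never equals a string.

```python
import numpy as np

groups = np.array([('Species1',), ('Species2', 'Species3')], dtype=object)
target = ('Species2', 'Species3')

# Pair element i of groups with item i of target, as broadcasting effectively does.
pairwise = [g == t for g, t in zip(groups, target)]
print(pairwise)

# NumPy's own elementwise comparison of the same two operands.
elementwise = groups == target
print(elementwise)
```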

If I use a list comprehension to compare the elements of groups one at a time, the match is found:

In [60]: [g == groups[1] for g in groups]                                                            
Out[60]: [False, True]
In [61]: np.where([g == groups[1] for g in groups])                                                  
Out[61]: (array([1]),)

But for this sort of thing, a list would work just as well:

In [66]: alist = [('Species1',), ('Species2', 'Species3')]                                           
In [67]: alist.index(alist[1])                                                                       
Out[67]: 1
In [68]: alist.index(('Species1',))                                                                  
Out[68]: 0
In [69]: alist.index(('Species2',))                                                                  
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-69-0b16b56ad28c> in <module>
----> 1 alist.index(('Species2',))

ValueError: ('Species2',) is not in list
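
Since list.index raises ValueError for a missing item, a lookup that may fail can be wrapped in a small helper. This is only a sketch; index_or_none is a hypothetical name, and returning None for "not found" is one choice among several.

```python
def index_or_none(items, target):
    """Return the position of target in items, or None if it is absent."""
    try:
        return items.index(target)
    except ValueError:
        # list.index signals a missing item with ValueError.
        return None

alist = [('Species1',), ('Species2', 'Species3')]
found = index_or_none(alist, ('Species2', 'Species3'))
missing = index_or_none(alist, ('Species2',))
print(found, missing)
```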
hpaulj