
I am learning how to use GraphLab for machine learning. I have a dataset with four columns, including one called 'name' and another called 'review'.

I want to get the reviews of a specific product by looking it up by name. This is what I tried, but I keep getting the error - ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

if (products['name'] == "Vulli Sophie the Giraffe Teether"):
    print (products['review'])

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-15-8607777f5c3b> in <module>()
----> 1 if (products['name'] == "Vulli Sophie the Giraffe Teether"):
      2     print products['review']

C:\Users\user\Anaconda2\envs\gl-env\lib\site-packages\graphlab\data_structures\sarray.pyc in __nonzero__(self)
    752         """
    753         # message copied from Numpy
--> 754         raise ValueError("The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()")
    755 
    756     def __bool__(self):

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Edit -

  if (products['name'] == "Vulli Sophie the Giraffe Teether"):
        print products['name']
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-1be157eebb1a> in <module>()
----> 1 if (products['name'] == "Vulli Sophie the Giraffe Teether"):
      2     print products['name']

C:\Users\user\Anaconda2\envs\gl-env\lib\site-packages\graphlab\data_structures\sarray.pyc in __nonzero__(self)
    752         """
    753         # message copied from Numpy
--> 754         raise ValueError("The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()")
    755 
    756     def __bool__(self):

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
halfer

2 Answers


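The error message is telling you that products['name'] is a whole column (an SArray, according to the traceback), not a single string. Comparing it with == produces one boolean per row, and if cannot reduce that to a single truth value.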
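To see why the `if` fails, here is a minimal sketch that uses NumPy as a stand-in for the GraphLab column. That substitution is an assumption; the traceback only says the message was copied from NumPy. The array contents are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for products['name'].
names = np.array(["Vulli Sophie the Giraffe Teether", "Some other product"])

# Comparing an array with a string is element-wise: one boolean per row.
mask = names == "Vulli Sophie the Giraffe Teether"

# `if mask:` needs a single truth value, which is ambiguous for two elements.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# Reducing explicitly gives an unambiguous answer.
found_any = bool(mask.any())  # at least one row matches
found_all = bool(mask.all())  # every row matches
```

Which reduction is appropriate depends on the question being asked: `any()` answers "is this product present at all?", while `all()` answers "is every row this product?".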

If you want to check whether any value in products['name'] is equal to your title, you should change the condition to:

if "Vulli Sophie the Giraffe Teether" in products['name']:
    # other stuff

Hope this helps.

Qichao Zhao

All I know of GraphLab is that it seems to use the numpy module to provide arrays. That said, let's start with some data similar to yours:

In [21]: import numpy as np

In [22]: prods = np.array((['a', 'b', 'c'], [1, 2, 3])) 

In [23]: prods
Out[23]: 
array([['a', 'b', 'c'],
       ['1', '2', '3']], 
      dtype='<U1')

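We have a vector of names ('a', 'b', 'c') and a vector of review scores (1, 2, 3). Note that numpy has converted the scores to strings, because an array holds a single dtype.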

Next, to find the positions of a particular name you use a vectorized boolean expression like this

In [24]: prods[0] == 'b'
Out[24]: array([False,  True, False], dtype=bool)

As you can see, the result is a vector of boolean values.

The beauty of numpy is that you can index arrays with such a boolean vector (a form of "fancy" indexing):

In [26]: prods[1, prods[0] == 'b']
Out[26]: 
array(['2'], 
      dtype='<U1')

This says: in prods, take row 1 (the review scores) as the first index and, as the second index, keep only the positions where the boolean vector is True.
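The same selection can be sketched with two separate arrays, which keeps the scores numeric instead of coercing them to strings. The array names here are illustrative and not part of the question's data.

```python
import numpy as np

# Separate arrays keep each column's own dtype.
names = np.array(["a", "b", "c"])
scores = np.array([1, 2, 3])

# Boolean mask: True where the name matches.
mask = names == "b"

# Indexing with the mask keeps only the matching positions.
selected = scores[mask]
```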

What happens if there is no match? You get an empty array:

In [27]: prods[1, prods[0] == 'd']
Out[27]: 
array([], 
      dtype='<U1')

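An empty result has length zero, so you can test for a match like this (untested; names and do_stuff are placeholders):

for my_name in names:
    # filter the rows by name, then take the 'review' column
    my_review = products[products['name'] == my_name]['review']
    if len(my_review) > 0:
        do_stuff(my_review)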
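A runnable version of this idea, again with NumPy standing in for GraphLab, might look like the sketch below. `reviews_for` is a hypothetical helper and the data is invented. It checks `.size` rather than relying on the truthiness of an empty array, which NumPy has deprecated.

```python
import numpy as np

def reviews_for(names, reviews, wanted):
    """Return the reviews whose product name equals `wanted`."""
    mask = names == wanted  # element-wise comparison, one boolean per row
    return reviews[mask]    # keep only the matching rows

# Invented sample data: product "b" has two reviews.
names = np.array(["a", "b", "c", "b"])
reviews = np.array(["good", "great", "poor", "fine"])

results = {}
for wanted in ["b", "d"]:
    matches = reviews_for(names, reviews, wanted)
    if matches.size > 0:  # explicit emptiness check
        results[wanted] = matches.tolist()
```

Only products with at least one matching row end up in `results`; a name with no match simply produces an empty array and is skipped.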
gboffi