
I am trying to add a new column, category, whose value depends on the existing id column.

conditions = [
    (result['id'] == 1362) or (result['id'] == 7463),
    (result['id'] == 543) or (result['id'] == 3424)]
choices = ['A1', 'A2']
result['category'] = np.select(conditions, choices, default='black')

But I got an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-289-7cb2bbdaab53> in <module>()
      1 conditions = [
----> 2     (result['id'] == 1362) or (result['id'] == 7463),
      3     (result['id'] == 543) or (result['id'] == 3424)]
      4 choices = ['A1', 'A2']
      5 result['category'] = np.select(conditions, choices, default='black')

/anaconda/lib/python3.6/site-packages/pandas/core/generic.py in __nonzero__(self)
    951         raise ValueError("The truth value of a {0} is ambiguous. "
    952                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 953                          .format(self.__class__.__name__))
    954 
    955     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I correct this?

Mr. T
chu
  • Does this answer your question? [Pandas: How do I assign values based on multiple conditions for existing columns?](https://stackoverflow.com/questions/30631841/pandas-how-do-i-assign-values-based-on-multiple-conditions-for-existing-columns) – iacob Mar 26 '21 at 17:16

1 Answer


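Python's `or` tries to reduce each Series to a single truth value, which is ambiguous and raises the ValueError. With pandas you need an element-wise logical OR instead, either the `|` operator or `np.logical_or()`: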

import numpy as np
import pandas as pd

d = {"id": [1362, 1361, 7463, 7462, 543, 542, 3424, 3333]}
result = pd.DataFrame(d)

conditions = [np.logical_or(result['id'] == 7463, result['id'] == 1362),
              np.logical_or(result['id'] == 3424, result['id'] == 543)]

# Alternative syntax using the | operator. The parentheses are required
# because | binds more tightly than ==.
# conditions = [(result['id'] == 7463) | (result['id'] == 1362),
#               (result['id'] == 3424) | (result['id'] == 543)]

choices = ['A1', 'A2']
result['category'] = np.select(conditions, choices, default='black')

print(result)

     id category
0  1362       A1
1  1361    black
2  7463       A1
3  7462    black
4   543       A2
5   542    black
6  3424       A2
7  3333    black
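
When every condition is a membership check against a fixed set of ids, as it is here, `Series.isin` may be a more compact way to build the same boolean masks. The following is a minimal sketch, assuming the same sample data as above; the id lists are simply the ones from the question.

```python
import numpy as np
import pandas as pd

# Same sample data as in the example above.
result = pd.DataFrame({"id": [1362, 1361, 7463, 7462, 543, 542, 3424, 3333]})

# Each condition is a boolean Series: True where id is one of the listed values.
conditions = [
    result["id"].isin([1362, 7463]),
    result["id"].isin([543, 3424]),
]
choices = ["A1", "A2"]

# Rows that match no condition fall back to the default.
result["category"] = np.select(conditions, choices, default="black")

print(result)
```

Adding another id to a category then only means extending the corresponding list.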
iacob
  • Thanks for the reply. But if I add one more condition (e.g., result['id'] == 1362, result['id'] == 7463, result['id'] == 1361), it doesn't work. It seems np.logical_or only compares two values at a time, if I understand correctly. What should I do if I have more than two values to check? – chu Jun 17 '18 at 13:26
  • @chu You can nest them e.g. `np.logical_or(result['id'] == 7463, np.logical_or(result['id'] == 1361, result['id'] == 1362))` – iacob Jun 17 '18 at 13:52
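
As an alternative to nesting, `np.logical_or.reduce` takes a list of boolean arrays and ORs all of them together element-wise, so it is not limited to two operands. The sketch below assumes the sample data from the answer and the three ids from the comment above; the variable names `first` and `second` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the answer.
result = pd.DataFrame({"id": [1362, 1361, 7463, 7462, 543, 542, 3424, 3333]})

# OR together any number of element-wise comparisons in one call.
first = np.logical_or.reduce([
    result["id"] == 1362,
    result["id"] == 7463,
    result["id"] == 1361,
])
second = np.logical_or.reduce([
    result["id"] == 543,
    result["id"] == 3424,
])

result["category"] = np.select([first, second], ["A1", "A2"], default="black")

print(result)
```

Chaining the `|` operator, as in `(a) | (b) | (c)`, should give the same masks if you prefer to stay with operators.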