3

I am using the following code:

if(df.month == 3 or df.month == 4 or df.month == 5):
    df.test = 'A'
elif(df.month == 6 or df.month == 7 or df.month == 8):
    df.test = 'B'
else:
    df.test = 'C'

But when I run it, I get the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Update:

print(df.dtypes)

Unnamed: 0      int64
year            int64
month           int64
day             int64
dep_time      float64
dep_delay     float64
arr_time      float64
arr_delay     float64
carrier        object
tailnum        object
flight          int64
origin         object
dest           object
air_time      float64
distance        int64
hour          float64
minute        float64
dtype: object

Can anybody help me find the error here?

jezrael
haimen

5 Answers

2

I think the best approach is to use loc and isin. You can't use a comparison between a Series and a scalar as an if or elif condition: the comparison returns a boolean Series, and its truth value is ambiguous:

print(df)

   year  month  day
0  2005      3   20
1  2005      4   20
2  2005      5   20
3  2005      6   20
4  2005      7   20
5  2005      8   20
6  2005      9   20

df['test'] = 'C'
df.loc[df['month'].isin([3, 4, 5]), 'test'] = 'A'
df.loc[df['month'].isin([6, 7, 8]), 'test'] = 'B'

print(df)

   year  month  day test
0  2005      3   20    A
1  2005      4   20    A
2  2005      5   20    A
3  2005      6   20    B
4  2005      7   20    B
5  2005      8   20    B
6  2005      9   20    C

Or you can assign C explicitly by listing the remaining months instead of filling the column with a default first:

df.loc[df['month'].isin([3, 4, 5]), 'test'] = 'A'
df.loc[df['month'].isin([6, 7, 8]), 'test'] = 'B'
df.loc[df['month'].isin([1, 2, 9, 10, 11, 12]), 'test'] = 'C'

print(df)

   year  month  day test
0  2005      3   20    A
1  2005      4   20    A
2  2005      5   20    A
3  2005      6   20    B
4  2005      7   20    B
5  2005      8   20    B
6  2005      9   20    C
jezrael
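For readers who want to stay close to the original chain of `or` tests, here is a minimal sketch of the same idea using element-wise `|` masks combined with `numpy.select`. This is a swapped-in alternative to the loc/isin approach, not something the answer itself proposes; the sample data and the names `mask_a` and `mask_b` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's table.
df = pd.DataFrame({"month": [3, 4, 5, 6, 7, 8, 9]})

# The element-wise operator | replaces the Python keyword `or`.
# Each comparison needs its own parentheses because | binds tighter than ==.
mask_a = (df["month"] == 3) | (df["month"] == 4) | (df["month"] == 5)
mask_b = (df["month"] == 6) | (df["month"] == 7) | (df["month"] == 8)

# For each row, np.select takes the choice of the first mask that is True,
# and falls back to the default when no mask matches.
df["test"] = np.select([mask_a, mask_b], ["A", "B"], default="C")
print(df)
```

The masks are boolean Series with one entry per row, so no single truth value is ever requested and the ValueError does not arise.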
0

You can use a list comprehension to create your test column:

>>> import pandas as pd
>>> df = pd.DataFrame({'month' : pd.Series(range(1,13))})
>>> df['test'] = ['A' if m in [3,4,5] else 
...               'B' if m in [6,7,8] else 
...               'C' for m in df['month']]
>>> df
    month test
0       1    C
1       2    C
2       3    A
3       4    A
4       5    A
5       6    B
6       7    B
7       8    B
8       9    C
9      10    C
10     11    C
11     12    C

Or you can apply a function, which produces the same result:

>>> def value(month):
...     if month in [3,4,5]:
...         return 'A'
...     if month in [6,7,8]:
...         return 'B'
...     return 'C'
>>> df['test'] = df['month'].apply(value)
AChampion
0

Try mapping a function over the column:

def valuesetter(x):
    if x in [3,4,5]: return "A"
    elif x in [6,7,8]: return "B"
    else: return "C"

df["test"] = list(map(valuesetter,df.month))
Matthew
0

The exception message you're getting is fairly self-explanatory. df['month'] is a Series, so a comparison such as df['month'] == 3 produces a Series of truth values, one per row, and a Series of several truth values has no single truth value for if to test. You can do what you're trying to do with pd.Series.map:

def assignmentFunction(value):
    if value in [3, 4, 5]:
        return 'A'
    elif value in [6, 7, 8]:
        return 'B'
    else:
        return 'C'

df['test'] = df['month'].map(assignmentFunction)
Thtu
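As a possible variation on the map approach above, `Series.map` also accepts a dict, which avoids writing a function for a simple lookup. The sketch below assumes a hypothetical lookup table named `labels` and a small sample frame; months missing from the dict come back as NaN, so `fillna` supplies the default.

```python
import pandas as pd

# Hypothetical sample data covering all twelve months.
df = pd.DataFrame({"month": range(1, 13)})

# Hypothetical lookup table: only the months that get 'A' or 'B' are listed.
labels = {3: "A", 4: "A", 5: "A", 6: "B", 7: "B", 8: "B"}

# Months that are not keys of the dict map to NaN; fillna turns those into 'C'.
df["test"] = df["month"].map(labels).fillna("C")
print(df)
```

Whether this reads better than the function version is a matter of taste; the dict form tends to suit cases where the rule is a plain value-to-label table.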
-1

This answer mainly tries to explain the error that you're seeing. As I'm not a pandas user, I'll let the other answers speak to better ways to write this code...


df.month returns a pandas Series, which behaves like a numpy array here. some_array == 6 returns another array of booleans, constructed such that new_array[i] is True if and only if some_array[i] == 6.

Because of situations like this, in numpy, an array with more than one element does not have a truth value (unlike normal Python sequences). So, to test whether an array is truthy, you need to specify what you mean. For example, to require that all elements be truthy, you'd write (df.month == 6).all().

mgilson
  • When I use all(), the error goes away, but the if condition still doesn't do what I need, so this solution didn't work for this particular problem. – haimen Jan 13 '16 at 06:36
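The behaviour described in the explanation and the comment above can be reproduced with a short sketch. It uses a hypothetical three-row frame: the comparison yields a boolean Series, testing it in an if raises the ValueError, and .any() or .all() collapse it to one value for the whole column, which is why they silence the error without labelling rows individually.

```python
import pandas as pd

# Hypothetical sample data: one matching month and two non-matching ones.
df = pd.DataFrame({"month": [3, 6, 9]})

mask = df["month"] == 6      # element-wise comparison: a boolean Series
print(mask.tolist())         # [False, True, False]

raised = False
try:
    if mask:                 # asks the whole Series for a single truth value
        pass
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# .any() and .all() each reduce the mask to a single boolean,
# so an if built on them treats every row the same way.
print(mask.any(), mask.all())

# Indexing with the mask itself labels only the matching rows.
df["test"] = "C"
df.loc[mask, "test"] = "B"
print(df)
```

This is only an illustration of the mechanism; the loc-based assignment at the end is the same technique the accepted-style answers in this thread use.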