I have a dataframe (it is the product of using the pivot function, which is why it has the c and a):
c 367235 419895 992194
a
1999-02-06 Nan 9 Nan
2000-04-03 2 Nan Nan
1999-04-12 Nan Nan 4
1999-08-08 2 Nan Nan
1999-11-01 8 5 1
1999-12-08 Nan 3 Nan
1999-08-17 Nan Nan 10
1999-10-22 Nan 3 Nan
1999-03-23 Nan 4 Nan
2000-09-29 9 Nan Nan
1999-04-30 Nan Nan 1
2000-09-30 9 Nan Nan
I would like to add a new row at the bottom of this dataframe. Each cell in the new row will evaluate the column above it; if the column contains the numbers 9, 8 or 3, the cell will evaluate to "TRUE". If the column does not contain those numbers, the cell will evaluate to "FALSE". Ultimately, my goal is to delete the columns with a "FALSE" cell using the drop function, creating a dataset like so:
c 367235 419895
a
1999-02-06 Nan 9
2000-04-03 2 Nan
1999-04-12 Nan Nan
1999-08-08 2 Nan
1999-11-01 8 5
1999-12-08 Nan 3
1999-08-17 Nan Nan
1999-10-22 Nan 3
1999-03-23 Nan 4
2000-09-29 9 Nan
1999-04-30 Nan Nan
2000-09-30 9 Nan
TRUE TRUE
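To make the goal concrete, here is a minimal sketch of one way the column filter could be expressed. It assumes a small hypothetical reconstruction of the frame (first five rows only), and it keeps the TRUE/FALSE flags in a separate boolean Series rather than appending them as a row, since a row of strings would change the dtype of every column. The names `targets`, `has_target` and `columns_to_drop` are illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the pivoted frame (first five rows only).
df_wanted = pd.DataFrame(
    {
        367235: [np.nan, 2, np.nan, 2, 8],
        419895: [9, np.nan, np.nan, np.nan, 5],
        992194: [np.nan, np.nan, 4, np.nan, 1],
    },
    index=pd.to_datetime(
        ["1999-02-06", "2000-04-03", "1999-04-12", "1999-08-08", "1999-11-01"]
    ),
)
df_wanted.index.name = "a"
df_wanted.columns.name = "c"

targets = [9, 8, 3]

# One boolean per column: True if any cell in that column is 9, 8 or 3.
has_target = df_wanted.isin(targets).any(axis=0)

# Drop the columns flagged False, using the drop function as intended.
columns_to_drop = has_target.index[~has_target]
result = df_wanted.drop(columns=columns_to_drop)

print(result)
```

With this sample, only the 992194 column lacks a 9, 8 or 3, so it is the one dropped.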
My problem:
I can write a function that checks whether any of several numbers is in a list, but I cannot get that function to work with .apply.
That is, I found that this works for determining if a group of numbers is in a list:
I tried to modify it as follows for the apply function:
def BIS(i):
    L1 = [9,8,3]
    if i in L1:
        return "TRUE"
    else:
        return "FALSE"
df_wanted.apply(BIS, axis = 0)
This results in an error: ('the truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item, a.any().' u'occured at index 367235')
This makes me think that although .apply takes an entire column as input, it cannot combine the truth values of the individual cells into a single truth value for the column. I looked up a.any and a.bool, and they look very useful, but I don't know where to put them. For example, this didn't work:
df_wanted.apply.any(BIS, axis = 0)
Nor did this:
df_wanted.apply(BIS.any, axis = 0)
Can anyone point me in the right direction? Many thanks in advance
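For illustration, the sketch below shows where `.any()` could fit if the function is kept with `.apply`. It assumes a hypothetical two-column stand-in for `df_wanted`, and it returns real booleans rather than the strings "TRUE"/"FALSE". The idea is that `.any()` is called on the result of a per-cell test inside the function, not on `.apply` or on the function object itself. The last part reproduces the ambiguity error from `i in L1` when `i` is a whole column.

```python
import numpy as np
import pandas as pd

L1 = [9, 8, 3]

def BIS(col):
    # With axis=0, apply passes each column in as a Series.
    # isin tests every cell; any() collapses the results to one True/False.
    return col.isin(L1).any()

# Hypothetical stand-in for df_wanted with two of the original columns.
df_wanted = pd.DataFrame({367235: [np.nan, 2, 8], 992194: [np.nan, 4, 1]})

flags = df_wanted.apply(BIS, axis=0)
print(flags)

# The original `i in L1` compares the whole Series to each list element,
# then needs a single truth value for that comparison, which pandas refuses.
try:
    df_wanted[367235] in L1
    raised = False
except ValueError:
    raised = True
print(raised)
```

The resulting `flags` Series is indexed by column name, so it can be used directly to pick which columns to keep or drop.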