I have a DataFrame in a variable called "myDataFrame" that looks like this:
+------+-------+--------+
| Type | Count | Status |
+------+-------+--------+
| a    | 70    | 0      |
| a    | 70    | 0      |
| b    | 70    | 0      |
| c    | 74    | 3      |
| c    | 74    | 2      |
| c    | 74    | 0      |
+------+-------+--------+
I am using a vectorized approach to process the rows in this DataFrame because it has about 116 million rows.
So I wrote something like this:
    myDataFrame['result'] = processDataFrame(myDataFrame['Status'], myDataFrame['Count'])
In my function, I am trying to do this:
    def processDataFrame(status, count):
        resultsList = list()
        if status == 0:
            resultsList.append(count + 10000)
        else:
            resultsList.append(count - 10000)
        return resultsList
But the comparison on the status values raises this error:
    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What am I missing?
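
For context on the error: `status == 0` compares the whole Series at once and produces a boolean Series, and a plain `if` cannot reduce that to a single True or False. Below is a minimal sketch of an element-wise version, assuming `numpy.where` is acceptable here. The small sample frame is rebuilt from the table above purely for illustration, and the function keeps the name processDataFrame only so the call site stays the same.

```python
import numpy as np
import pandas as pd

# Illustrative sample data mirroring the table in the question.
myDataFrame = pd.DataFrame({
    "Type": ["a", "a", "b", "c", "c", "c"],
    "Count": [70, 70, 70, 74, 74, 74],
    "Status": [0, 0, 0, 3, 2, 0],
})

def processDataFrame(status, count):
    # status == 0 is a boolean Series; np.where picks, per element,
    # count + 10000 where it is True and count - 10000 where it is False.
    return np.where(status == 0, count + 10000, count - 10000)

myDataFrame["result"] = processDataFrame(myDataFrame["Status"], myDataFrame["Count"])
print(myDataFrame)
```

This avoids a Python-level loop and an intermediate list, which should matter at around 116 million rows, though actual timings on the real data are not something this sketch measures.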