I'm modifying a CSV file using Python pandas. I am fairly new to this and am experimenting with pandas as an alternative to Excel for data handling and manipulation.
Now I run into a problem trying to conditionally change the value of a cell in column df.duration based upon the value of a cell on the same row in column df.paymenttype.
So I've tried modifying the value in df.duration using the .loc method:
df.loc[df.paymenttype == 'cash', 'duration'] = (df.duration % 1)
It gives the expected outcome and works fine. However, in this case df.duration % 1 returns an unwanted value of 0.0 for certain rows. That is mathematically correct, but whenever df.duration % 1 returns 0.0 I want to set the value of df.duration to 1 instead.
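To make the behaviour concrete, here is a minimal sketch on made-up data. The rows and values are hypothetical stand-ins for my CSV; only the column names paymenttype and duration come from the real file. It shows what the .loc assignment currently produces and what I would like to get instead.

```python
import pandas as pd

# Hypothetical sample data; the real CSV has more rows and columns.
df = pd.DataFrame({
    "paymenttype": ["cash", "card", "cash", "cash"],
    "duration": [2.5, 3.0, 4.0, 0.25],
})

# Current approach: keep only the fractional part of duration on cash rows.
is_cash = df["paymenttype"] == "cash"
df.loc[is_cash, "duration"] = df["duration"] % 1

print(df["duration"].tolist())  # [0.5, 3.0, 0.0, 0.25]

# Desired result for the same input: the 0.0 on the third row should be 1.
wanted = [0.5, 3.0, 1.0, 0.25]
```

The card row is left untouched, which is what I want; only the 0.0 on the third row is the problem.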
So I thought I might be able to do something like this:
df.loc[df.paymenttype == 'cash', 'duration'] = 1 if df.duration % 1 == 0 else (df.duration % 1)
This, however, raises: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
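For anyone who wants to reproduce this, a minimal sketch follows. The sample data is made up for illustration, and the try/except is only there to capture the message. The frame is unchanged afterwards, so the error appears to occur before anything is assigned.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "paymenttype": ["cash", "card", "cash"],
    "duration": [2.5, 3.0, 4.0],
})

error_message = None
try:
    # Same statement as in the question, wrapped to capture the exception.
    df.loc[df["paymenttype"] == "cash", "duration"] = (
        1 if df["duration"] % 1 == 0 else df["duration"] % 1
    )
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print(df["duration"].tolist())  # still the original values
```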
Now I am wondering two things:
- Why is this ValueError raised and how could I fix this?
I could and should be doing more research on this myself before dropping this question here and I will. But more importantly and for future projects (since I am fairly new to Python and Pandas):
- I am now wondering whether the .loc method is the right way to conditionally change the values for column cells in general and in this certain case where I want to add a conditional statement when setting the value.