from datetime import datetime

def comment(row):
    # Conditions are checked in order; the first match wins.
    if row['STATUS'] == "CANCELLED":
        return "Cancelled"
    elif strToDate(row['PROCESS_DATE']) < datetime(2018, 1, 1) or strToDate(row['PROCESS_DATE']) > datetime(2018, 2, 1):
        return "Date out of Range"
    elif "Lost" in str(row['NOTE']) or "Stolen" in str(row['TRADE_NOTE_TXT']):
        return 'Lost or Stolen'
    else:
        return 'Other'

# Start with an empty column, then fill it in row by row.
df['Comment'] = ''
for i, row in df.iterrows():
    df.at[i, "Comment"] = comment(row)
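For reference, the same labelling can probably be written without the explicit loop by using `DataFrame.apply` with `axis=1`. The following is only a minimal sketch on made-up rows: the sample data is invented, and `str_to_date` is a hypothetical stand-in for `strToDate`, which is not shown here.

```python
from datetime import datetime

import pandas as pd

# Made-up sample rows; only the column names come from the code above.
df = pd.DataFrame({
    "STATUS": ["CANCELLED", "OPEN", "OPEN", "OPEN"],
    "PROCESS_DATE": ["2018-01-10", "2017-12-31", "2018-01-15", "2018-01-20"],
    "NOTE": ["", "", "Lost in transit", None],
    "TRADE_NOTE_TXT": ["", "", "", ""],
})


def str_to_date(value):
    # Hypothetical stand-in for strToDate (assumption: it parses a date string).
    return pd.to_datetime(value).to_pydatetime()


def comment(row):
    # Same conditions, in the same order, as the original function.
    if row["STATUS"] == "CANCELLED":
        return "Cancelled"
    processed = str_to_date(row["PROCESS_DATE"])
    if processed < datetime(2018, 1, 1) or processed > datetime(2018, 2, 1):
        return "Date out of Range"
    if "Lost" in str(row["NOTE"]) or "Stolen" in str(row["TRADE_NOTE_TXT"]):
        return "Lost or Stolen"
    return "Other"


# axis=1 passes each row to comment(), replacing the iterrows loop.
df["Comment"] = df.apply(comment, axis=1)
print(df["Comment"].tolist())
```

With this approach every row gets exactly one label, so the column should have as many entries as the frame has rows.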
I use the code above to set the value of df['Comment'] based on these conditions. However, when I run df.count(), it shows that there are 7790 values in Comment.
But when I run df.groupby('Comment').size(), the output is as follows, which is much greater than the number of comments that should even be present.
Comment
Cancelled            1171
Date out of Range    1175
Lost or Stolen        634
Other                4810
dtype: int64
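To compare the two numbers directly, a check along these lines might help. It is a minimal sketch on made-up labels (not my real data) showing how the non-null count of a column relates to the sum of the group sizes.

```python
import pandas as pd

# Made-up labels, including one missing value, purely for illustration.
df = pd.DataFrame(
    {"Comment": ["Cancelled", "Other", "Other", None, "Lost or Stolen"]}
)

# count() only counts non-null entries in the column.
non_null = df["Comment"].count()

# size() gives the number of rows per label; rows whose key is missing
# are dropped by groupby by default.
group_sizes = df.groupby("Comment").size()
total = group_sizes.sum()

print(non_null, total, len(df))
```

On the real frame, the same comparison would be `df['Comment'].count()` against `df.groupby('Comment').size().sum()`.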