Python-pandas counting by groupby inaccuracy

Question

from datetime import datetime

# strToDate is the asker's own helper (not shown here); it is assumed to parse PROCESS_DATE into a datetime.
def comment(row):
    if row['STATUS'] == "CANCELLED":
        return "Cancelled"
    elif strToDate(row['PROCESS_DATE']) < datetime(2018, 1, 1) or strToDate(row['PROCESS_DATE']) > datetime(2018, 2, 1):
        return "Date out of Range"
    elif "Lost" in str(row['NOTE']) or "Stolen" in str(row['TRADE_NOTE_TXT']):
        return 'Lost or Stolen'
    else:
        return 'Other'

df['Comment'] = ''

# Fill the Comment column row by row.
for i, row in df.iterrows():
    df.at[i, "Comment"] = comment(row)

I use the code above to set the value of df['Comment'] based on these conditions. However, when I run df.count(), it shows there are 7790 values in Comment.

When I run df.groupby('Comment').size(), the output is as follows, which looks much greater than the number of comments that should be present:

     Comment
     Cancelled            1171
     Date out of Range    1175
     Lost or Stolen       634
     Other                4810
     dtype: int64

Those numbers add up to the correct amount, I do not see the issue — rahlf23, Jun 29 '18 at 19:58
https://stackoverflow.com/questions/33346591/what-is-the-difference-between-size-and-count-in-pandas — BENY, Jun 29 '18 at 20:02

score 1 · Answer 1 · answered Jun 29 '18 at 19:53

Maybe I am confused as to what it is you're asking but those numbers add up:

1171 + 1175 + 634 + 4810 = 7790

This means that df.count() and df.groupby('Comment').size() account for the same number of rows: the group sizes are per-category counts, and their sum equals the 7790 total.
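To make the relationship concrete, here is a minimal sketch on made-up data (the toy DataFrame and its values are hypothetical, not the asker's data). It shows that the per-group sizes sum to the column's count, and, as the linked question discusses, that size() counts rows including missing values while count() only counts non-null entries.

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's DataFrame.
df = pd.DataFrame({
    "Comment": ["Cancelled", "Other", "Other", "Lost or Stolen", "Other"],
    "NOTE": ["a", None, "Lost", "Stolen", None],
})

# Non-null entries in the Comment column (a single number).
total_non_null = df["Comment"].count()

# Number of rows in each Comment group (one number per category).
group_sizes = df.groupby("Comment").size()

# The per-group sizes add up to the overall count.
sizes_match_total = group_sizes.sum() == total_non_null

# count() skips missing values, so it can be smaller than size().
note_counts = df.groupby("Comment")["NOTE"].count()

print(total_non_null)
print(group_sizes)
print(sizes_match_total)
print(note_counts)
```

If the two totals ever disagree in real data, missing values in the grouping column are one thing worth checking, since groupby drops NaN keys by default.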

tobsecret

score -1 · Answer 2 · edited Jun 29 '18 at 20:20

You need to first properly indent your code under the def comment(row): function to get the answer you expect.

edited Jun 29 '18 at 20:20

rahlf23

answered Jun 29 '18 at 19:41

Keith W

Sorry that was a copy and paste error, it is indented properly – Rebecca Gonzalez Jun 29 '18 at 19:44
I was trying to comment but it would not let me. I was just trying to help, not get downvoted. – Keith W Jun 29 '18 at 19:55
