Sorry for asking a basic question, but I am learning Python on my own and would appreciate it if someone could help me understand this.

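import numpy as np

# Keep only the rows where column 'a' is not null
cframe = frame[frame.a.notnull()]
# Label each row as 'Windows' or 'Not Windows' based on the text in column 'a'
operating_system = np.where(cframe['a'].str.contains('Windows'), 'Windows', 'Not Windows')
by_tz_os = cframe.groupby(['tz', operating_system])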

# Here I have a problem...
agg_counts = by_tz_os.size().unstack().fillna(0)
print(agg_counts[:10])

agg_counts2 = by_tz_os.count().unstack().fillna(0)
print(agg_counts2[:10])

I thought that the result of agg_counts2 (result2) would be the same as that of agg_counts (result1), but it is not.
I cannot understand why and would appreciate it if anyone could help.

result1 :
enter image description here

result2 :
enter image description here

1 Answer

`size` returns the number of rows times the number of columns when it is called on a DataFrame. I suggest you check the pandas documentation for both methods:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.size.html

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.count.html

Please also consider posting your original dataframe (or a sample), so that answers can be more specific and helpful to you.
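
As a minimal sketch of what the two linked pages describe, the snippet below calls `size` and `count` directly on a small DataFrame. The data is a hypothetical toy frame, not the asker's, and it only covers the plain `DataFrame` case, not a `groupby` object.

```python
import pandas as pd

# Hypothetical toy data (assumption): 3 rows, 2 columns, one missing value.
df = pd.DataFrame({
    "tz": ["A", "A", None],
    "a": ["Windows", "Mac", "Windows"],
})

# DataFrame.size is a single integer: rows * columns, nulls included.
total_cells = df.size

# DataFrame.count() is a Series: non-null values per column.
non_null = df.count()

print(total_cells)
print(non_null)
```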

Neo
  • 2
    In this case `size` is called on `groupby`, so it returns the number of rows in a group https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.GroupBy.size.html#pandas.core.groupby.GroupBy.size – gereleth Oct 18 '19 at 14:11
  • @gereleth Correct, I did not look carefully enough as the original dataframe is not described. – Neo Oct 18 '19 at 14:12
  • Neofytos and gereleth, thank you so much. I understand now. I will accept the answer as soon as I am allowed to do so (still a few more minutes to accept). Thank you very much. – user12010773 Oct 18 '19 at 14:16
  • 1
    This is strange because this answer doesn't deal with your stated question at all)). If you figured something out for yourself maybe you should post another answer or delete this question. – gereleth Oct 18 '19 at 14:22
  • Gereleth, may I clarify what I thought I understood? I was expecting to see exactly the same result from both "agg_counts" and "agg_counts2". But I understood that "agg_counts2" shows the number of entries for each column/row (and, because I used "groupby", the numbers are grouped by "Windows" or "Not Windows" as well). On the other hand, size() returns the number of rows times the number of columns for a DataFrame, which means it will not show the number of elements for each column/row. In this example, because I used "groupby", the numbers are counted for "Windows" and "Not Windows". – user12010773 Oct 18 '19 at 14:38
  • Gereleth, thank you for your comment. If what I understood (what I wrote above) is not correct, I would appreciate it if you could correct me. Thank you very much! – user12010773 Oct 18 '19 at 14:39
  • There definitely isn't any "rows times columns" happening. Please see the linked question for an explanation @user12010773 – gereleth Oct 18 '19 at 14:50
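
To make the point from the comments concrete, here is a minimal sketch of `size` and `count` called on a `groupby` object. The frame is hypothetical toy data standing in for the asker's (the extra column `c` is an assumption added to show a null value). It suggests why the two unstacked results differ in shape: `size()` gives one number per group, while `count()` gives one number per group for every remaining column.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (assumption); column 'c' has one missing value.
cframe = pd.DataFrame({
    "tz": ["NY", "NY", "NY", "LA"],
    "a": ["Windows NT", "Mac", "Windows 7", "Windows 10"],
    "c": ["US", None, "US", "US"],
})
operating_system = np.where(cframe["a"].str.contains("Windows"), "Windows", "Not Windows")
by_tz_os = cframe.groupby(["tz", operating_system])

# GroupBy.size(): a Series with the number of rows in each group.
sizes = by_tz_os.size()
# GroupBy.count(): a DataFrame with non-null counts per group, one column per
# non-grouping column ('a' and 'c' here).
counts = by_tz_os.count()

agg_counts = sizes.unstack().fillna(0)    # columns: Not Windows, Windows
agg_counts2 = counts.unstack().fillna(0)  # columns: (a, ...), (c, ...) pairs

print(agg_counts)
print(agg_counts2)
```

Selecting a single column before counting, for example `by_tz_os["a"].count()`, returns a Series shaped like `size()`, although the values can still differ wherever that column contains nulls.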