How to change a dataframe to count the variables

Question

I have a dataframe like this:

import pandas as pd

df = pd.DataFrame({"X":['a', 'b', 'c', 'b', 'b', 'a'],
                   "Y":['A', 'B', 'A', 'C', 'A', 'A']})

Which method can I use to count the matching values and reshape the dataframe like this:

score 1 · Answer 1 · answered Oct 30 '20 at 16:41


Maybe you can try crosstab (documentation):

pd.crosstab(df.Y, df.X)

Result:
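As a self-contained sketch of that call on the question's data (the variable name `table` is my own, not from the answer), the counts should come out as follows:

```python
import pandas as pd

df = pd.DataFrame({"X": ['a', 'b', 'c', 'b', 'b', 'a'],
                   "Y": ['A', 'B', 'A', 'C', 'A', 'A']})

# Rows are the distinct values of Y, columns the distinct values of X;
# each cell counts how many rows of df have that (Y, X) pair.
table = pd.crosstab(df.Y, df.X)
print(table)
# X  a  b  c
# Y
# A  2  1  1
# B  0  1  0
# C  0  1  0
```

Pairs that never occur, such as Y == 'B' with X == 'a', show up as 0 without any extra argument.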


niraj


score 0 · Answer 2 · answered Oct 31 '20 at 17:10

You can use the pivot_table method. The first parameter is the data source, and the columns, index, and fill_value parameters should be self-explanatory. One thing worth clarifying is the aggfunc parameter: 'size' counts every row in a group, including rows with NaN values, whereas 'count' counts only non-NaN values. You can read more about that in this answer.

pd.pivot_table(df, columns='X', index='Y', fill_value=0, aggfunc='size')

Output:
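On the question's data this should give the same counts as crosstab. The size/count difference only shows up when a counted column has missing values, so here is a rough sketch that assumes an extra numeric column `Z` (hypothetical, not in the question) with one NaN, passed through the `values` parameter:

```python
import numpy as np
import pandas as pd

# Z is a hypothetical column added only for illustration; the last row,
# which has X == 'a' and Y == 'A', is missing its Z value.
df = pd.DataFrame({"X": ['a', 'b', 'c', 'b', 'b', 'a'],
                   "Y": ['A', 'B', 'A', 'C', 'A', 'A'],
                   "Z": [1.0, 2.0, 3.0, 4.0, 5.0, np.nan]})

# 'size' counts every row in each (Y, X) group, whether Z is NaN or not.
by_size = pd.pivot_table(df, values='Z', index='Y', columns='X',
                         fill_value=0, aggfunc='size')

# 'count' counts only the rows whose Z is not NaN.
by_count = pd.pivot_table(df, values='Z', index='Y', columns='X',
                          fill_value=0, aggfunc='count')

print(int(by_size.loc['A', 'a']))   # 2
print(int(by_count.loc['A', 'a']))  # 1
```

Both (A, a) rows are counted by 'size', but only the one with a non-missing Z is counted by 'count'.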
