How to transform all values in a column of a pandas dataframe into new columns with counts?

Question

I have the following pandas DataFrame:

df = pd.DataFrame({"person": ["Joe", "Jack", "Thomas", "Jack", "Joe", "Thomas"],
                   "fruit": ["apple", "apple", "banana", "orange", "banana", "banana"]})

I want to transform this to a new DataFrame where each row is a person, and a column with a count for each type of fruit (the number of different fruit types is very small and known beforehand).

The desired output in this example would look like:

person    apple    banana    orange
Joe       1        1         0
Jack      1        0         1
Thomas    0        2         0

This looks so simple and I know it has something to do with pivoting, but I could not find any example online that works in my case. How to solve this?

Thanks in advance

[crosstab](https://pandas.pydata.org/docs/reference/api/pandas.crosstab.html) — Dani Mesejo, Oct 03 '21 at 15:48
`pd.crosstab(df['person'], df['fruit'])`[like this answer](https://stackoverflow.com/a/39132761/15497888) or `df.groupby(['person', 'fruit']).size().unstack(fill_value=0)`[like this answer](https://stackoverflow.com/a/39132900/15497888) — Henry Ecker, Oct 03 '21 at 17:06

score 0 · Accepted Answer · answered Oct 03 '21 at 16:17

This should do the job:

df = df.groupby(['person', 'fruit'])['fruit'].count().to_frame()
df = pd.crosstab(index=df.index.get_level_values('person'), 
                 columns=df.index.get_level_values('fruit'), 
                 aggfunc='sum', 
                 values=df.values, 
                 rownames=['person'], 
                 colnames=['fruit']).fillna(0)

How to transform all values in a column of a pandas dataframe into new columns with counts?

1 Answers1