Using GroupBy and value_counts

Question

I need some help with pandas. I have a DataFrame with a column of unique IDs, and each ID has a few different applications downloaded.

ID | AppID | Count
1  |  A    |   2
2  |  A    |   3
3  |  B    |   1
4  |  D    |   1
5  |  E    |   5

I am trying to group by ID and count the total for each AppID under each ID.

Expected output:

ID | A | B | C | D | E....
1  | 2 | 0 | 1 |  8 |  5
2  | 3 | 6 | 7 |  4 |  6  
3  | 9 | 1 | 2 |  5 |  7
4  | 3 | 8 | 4 |  1 |  3
5  | 1 | 1 | 3 |  5 |  5

The code I have tried is

t = df.groupby(['ID']).agg({i:'value_counts' for i in df.columns[1:]})

and

pd.crosstab(index=t['ID'], columns=t['count'])

The results I got:

ID | AppID | Count
1  |  A    |   2
1  |  B    |   0
1  |  C    |   1
1  |  D    |   8
1  |  E    |   5

2  |  A    |   3
2  |  B    |   6
2  |  C    |   7
2  |  D    |   4
2  |  E    |   6

In your expected output, the value for `ID=1, AppID='A'` is `1`. That doesn't seem to match the input you give as an example. — Pierre D, Dec 08 '20 at 04:05
Also: are you looking for the `sum(Count)` or just the number of rows? — Pierre D, Dec 08 '20 at 04:06

score 0 · Accepted Answer · answered Dec 08 '20 at 04:10

If you are looking to sum up the Count values, try:

df.groupby(['ID', 'AppID'])['Count'].sum().unstack(fill_value=0)

If instead you want the number of rows (the number of times each AppID appears for a given ID), regardless of your Count column, then try instead:

df.groupby(['ID', 'AppID']).size().unstack(fill_value=0)

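In both cases, the value for each (ID, AppID) pair is computed much as in your original attempt (but using only vectorized operations), and the result is then pivoted into a wide DataFrame with .unstack(), which moves the AppID index level into the columns.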
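To make the two variants concrete, here is a minimal sketch on made-up data. The rows below are hypothetical (they are not the asker's real data) and are chosen so that ID 1 has two rows for app A, which makes the sum and the row count differ. The row count uses `.size()`, which returns a single Series, so the unstacked result has plain AppID columns.

```python
import pandas as pd

# Hypothetical sample data: ID 1 has two separate rows for app A.
df = pd.DataFrame({
    "ID":    [1, 1, 1, 2, 2, 3],
    "AppID": ["A", "A", "B", "A", "C", "B"],
    "Count": [2, 1, 4, 3, 7, 1],
})

# Variant 1: sum the Count values per (ID, AppID), then pivot AppID
# into columns. Missing (ID, AppID) pairs are filled with 0.
sums = df.groupby(["ID", "AppID"])["Count"].sum().unstack(fill_value=0)

# Variant 2: count the rows per (ID, AppID), ignoring the Count values.
rows = df.groupby(["ID", "AppID"]).size().unstack(fill_value=0)

print(sums)
print(rows)
```

For ID 1 and app A, `sums` holds 3 (2 + 1) while `rows` holds 2 (two rows), so which variant you want depends on how you answer the question in the comments above.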

I didn't know about `.unstack()`. Thanks a lot for the help! — asgasega, Dec 08 '20 at 05:11
