0

Let's assume we have a df like below:

    A   B
0   a   101
1   b   101
2   a   101
3   d   101
4   c   102
5   a   102
6   b   102
7   b   102
8   b   102

Now, what I want to do is to create df with size # of unique values of A by # of unique values on B + 1. For this case it'd be 4x3, and fill the df by the counts of the elements in A within a group in B like follows:

    item 101 102
0   a     2   1
1   b     1   3
2   c     0   1
3   d     1   0

I know there is a way in R, but I need to do it in python. I've done it by some sort of counting loops, but that's not the elegant way and neither pythonic.

So, how can I do achieve this by groupby and aggregate or any other ideas ?

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
null
  • 1,944
  • 1
  • 14
  • 24

1 Answers1

1

Use crosstab:

import pandas as pd

data = [['a',   101],
['b',   101],
['a',   101],
['d',   101],
['c',   102],
['a',   102],
['b',   102],
['b',   102],
['b',   102]]

df = pd.DataFrame(data=data, columns=['A', 'B'])

result = pd.crosstab(df.A, df.B)
print(result.reset_index())

Output

B  A  101  102
0  a    2    1
1  b    1    3
2  c    0    1
3  d    1    0
Dani Mesejo
  • 61,499
  • 6
  • 49
  • 76