0

I have a data frame with 3 columns. I would like to group by the date and have the WinID as two columns and count in the cells. This is my dataframe:

    date        userid  extid   count
0   2019-01-16  1       4624    34
1   2019-01-16  1       4625    2
2   2019-01-16  2       4624    10
3   2019-01-16  2       4625    0
4   2019-01-17  1       4624    32
5   2019-01-17  1       4625    0
6   2019-01-17  2       4624    17
7   2019-01-17  2       4625    1
8   2019-01-18  1       4624    27
9   2019-01-18  1       4625    1
10  2019-01-18  2       4624    27
11  2019-01-18  2       4625    3
12  2019-01-21  1       4624    20
13  2019-01-21  1       4625    0
14  2019-01-21  2       4624    17
15  2019-01-21  2       4625    2

My desired output:

    date        userid 4624  4625
0   2019-01-16  1      34    2
1   2019-01-16  2      10    0
2   2019-01-17  1      32    0
3   2019-01-17  2      17    1
4   2019-01-18  1      27    1      
5   2019-01-18  2      27    3
6   2019-01-21  1      20    0
7   2019-01-21  2      17    2

I have tried transpose and pivot tables but have not been able to accomplish this.

sectechguy
  • 2,037
  • 4
  • 28
  • 61

1 Answers1

1

You can pivot_table with aggfunc='first'. pivot is conceptually correct, except it only supports a single scalar index argument.

res = pd.pivot_table(df, index=['date', 'userid'], columns='extid',
                     values='count', aggfunc='first').reset_index()

print(res)

extid        date  userid  4624  4625
0      2019-01-16       1    34     2
1      2019-01-16       2    10     0
2      2019-01-17       1    32     0
3      2019-01-17       2    17     1
4      2019-01-18       1    27     1
5      2019-01-18       2    27     3
6      2019-01-21       1    20     0
7      2019-01-21       2    17     2
jpp
  • 159,742
  • 34
  • 281
  • 339
  • I am getting: TypeError: cannot insert an item into a CategoricalIndex that is not already an existing category @jpp – sectechguy Jan 29 '19 at 20:52
  • @sectechguy, Can't replicate, looks like you have a series of categories somewhere which needs more care. For this, provide a **[mcve]**. See **[How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)** if you need help on how to do this. – jpp Jan 29 '19 at 20:53