2

I can't figure out for the life of me how to do the following in Pandas:

I have input data akin to:

| ticket | client |   status    |
+--------+--------+-------------+
|      1 | A      | complete    |
|      2 | A      | complete    |
|      3 | A      | in progress |
|      4 | B      | in progress |
|      5 | B      | in progress |
+--------+--------+-------------+

I'd like output like this:

| client | #complete | #in progress |
+--------+-----------+--------------+
| A      |         2 |            1 |
| B      |         0 |            2 |
+--------+-----------+--------------+

I've messed around with groupby, but then you get a multi-index where complete and in progress are part of the index (or using reindex, again there's a status column). But what I want (as the example shows) is to create a column per unique value from the original status column (and then count/groupby).

If generating a column for every unique value is not possible automatically, I can also manually write a line of code for each unique status (there's a limited amount of them and they are known), but even then my question is still how to achieve that.

RoninTDK
  • 33
  • 4

2 Answers2

1

Let us try

s = pd.crosstab(df.client, df.status).reset_index()
BENY
  • 317,841
  • 20
  • 164
  • 234
1

Another way is to groupby, value_counts and unstack status

 df1.groupby('client')['status'].value_counts().unstack('status')\
      .reset_index(drop=False).rename_axis(index=None, columns=None)



 client  complete  inprogress
0      A       2.0         1.0
1      B       NaN         2.0
wwnde
  • 26,119
  • 6
  • 18
  • 32