0

I have code below that gives me number of times a professor has a value for true or false ( true being 1 ) for various columns as:

df below as:

Name | Factory | Restaurant | Store | Building
Brian    True    False        True     False
Mike     True    True         True     True
Brian    True    False        False    True
Sam      False   False        False    False
Sam      True    False        True     True
Mike     True    False        False    False


cols = ['Factory', 'Restaurant', 'Store', 'Building'] ( because df has other columns I dont want to calc)


df =  df.groupby('Name', as_index=False)[cols].sum()

This gives me below:

Name | Factory | Restaurant | Store | Building
Brian    2          0           1        1
Mike     2          1           1        1
Sam      1          0           1        1

If I have another column in same df called status below like:

status
open
closed
open
closed

In the same df but I also want to group and calc the number of times it occurs by either open or closed how can I edit my code?

Basically output would have what I already have except it would say

 Name  Factory Open | Factory Closed... and so on 
    Brian   2               4
    Mike    0               1

Thanks

Chris90
  • 1,868
  • 5
  • 20
  • 42
  • Does this answer your question? [How to pivot a dataframe](https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe) – G. Anderson Sep 23 '20 at 17:37
  • Hmm not sure - so are you saying to pivot df so I can have status column values as col? – Chris90 Sep 23 '20 at 17:42
  • I have code above how I get current output values df.groupby('Name', as_index=False)[cols].sum() – Chris90 Sep 23 '20 at 17:53

1 Answers1

3

The output that you seek can be done but would take some finagling in joining the column values AND column names. Don't know if it answers your question, but I would prefer to do it like:

# sample data

Name     Factory Restaurant   Store    Building Status
Brian    True    False        True     False    open
Mike     True    True         True     True     closed
Brian    True    False        False    True     open
Sam      False   False        False    False    closed
Sam      True    False        True     True     open
Mike     True    False        False    False    closed

Groupby both Name and Status:

df1 = df.groupby(['Name', 'Status']).sum()
print(df1)

              Factory  Restaurant  Store  Building
Name  Status
Brian open          2           0      1         1
Mike  closed        2           1      1         1
Sam   closed        0           0      0         0
      open          1           0      1         1

If you wanted another way to consume the data you could unstack the now MultiIndex df1:

df2 = df1.unstack(1, fill_value=0)

print(df2)
        Factory         Restaurant    Store         Building
Status  closed open     closed open   closed open   closed open
Name
Brian        0    2          0    0        0    1        0    1
Mike         2    0          1    0        1    0        1    0
Sam          0    1          0    0        0    1        0    1

Ukrainian-serge
  • 854
  • 7
  • 12