1

I have below dataframe:

week           Country         Year   Value   category  
1               Angola          2005    6       A  
1               Angola          2005    13      A  
1               Angola          2005    10      B  
1               Angola          2005    11      C  
1               Angola          2005    5       D  
1               Angola          2006    3       A  
1               Angola          2006    2       B  
1               Angola          2006    7       C  
1               Angola          2006    3       D  
1               Angola          2006    6       E  
2               Angola          2006    3       A  
2               Angola          2006    2       B  
2               Angola          2006    7       C  
2               Angola          2006    3       D  

I want the above data frame to group by week in each year and prepare a pivot in python that gives below data frame. The same data continues for different countries and different categories might be added, if there is new category in any country then that category should be added to the data frame with the respective values in that data frame.

week      Country         year    category_A   Category_B     category_C category_D    Category_E
1           Angola         2005       19            10              11           5          0
1           Angola         2006       3             2               7           3           6
2           Angola         2006       3             2               7           3           0
jpp
  • 159,742
  • 34
  • 281
  • 339

2 Answers2

3

This is one way. Note you can use reset_index if you need your index as columns. You may also wish to rename your category columns as a separate step.

res = df.pivot_table(index=['week', 'Country', 'Year'], columns=['category'],
                     values='Value', aggfunc=np.sum, fill_value=0)

print(res)

category            A   B   C  D  E
week Country Year                  
1    Angola  2005  19  10  11  5  0
             2006   3   2   7  3  6
2    Angola  2006   3   2   7  3  0
jpp
  • 159,742
  • 34
  • 281
  • 339
  • Hi Jpp, in the category column if there is null(NAN) and i want that to be in my pivot list. But if i do just columns=['category'] then i will miss the values of NAN. Hoe do i overcome this. Thanks –  May 21 '18 at 01:36
2

In this case, you can also use groupby instead of pivot:

(df.groupby(['week', 'Year', 'Country', 'category'])
 .sum()
 .unstack(fill_value=0)
 .add_prefix('category_'))

Which gives you:

                  category_Value                                            
category              category_A category_B category_C category_D category_E
week Year Country                                                           
1    2005 Angola              19         10         11          5          0
     2006 Angola               3          2          7          3          6
2    2006 Angola               3          2          7          3          0
sacuL
  • 49,704
  • 8
  • 81
  • 106