
Pivoting a DataFrame in pandas leaves an annoying label ("store") above the columns, and reset_index() does not get rid of it. Could someone help me remove it? The code and the output I currently see are listed below.

import pandas as pd

products = pd.DataFrame({'category': ['Cleaning', 'Cleaning', 'Entertainment', 'Entertainment', 'Tech', 'Tech'],
                    'store': ['Walmart', 'Dia', 'Walmart', 'Fnac', 'Dia','Walmart'],
                    'price':[11.42, 23.50, 19.99, 15.95, 55.75, 111.55],
                    'testscore': [4, 3, 5, 7, 5, 8]})

pivot_products = products.pivot(index='category', columns='store', values='price') 

print(pivot_products)

Running this code block produces the following output:

store            Dia   Fnac  Walmart
category                            
Cleaning       23.50    NaN    11.42
Entertainment    NaN  15.95    19.99
Tech           55.75    NaN   111.55

When I call reset_index() on pivot_products, I get:

store       category    Dia   Fnac  Walmart
 0           Cleaning  23.50    NaN    11.42
 1      Entertainment    NaN  15.95    19.99
 2               Tech  55.75    NaN   111.55

I really don't want the store column that shows up here. It does not capture any relevant data and ends up holding garbage values. Any ideas?

usernamenotfound
    store is your column names... – BENY Nov 14 '17 at 16:44
    Also, see https://stackoverflow.com/a/47152692/2336654 for more information on how to pivot. Pay particular attention to how the column names get placed as the names of Index objects. – piRSquared Nov 14 '17 at 16:47

1 Answer


store isn't a column. It's the name of the columns Index object. Use pd.DataFrame.rename_axis to clear it:

# axis must be passed by keyword in recent pandas versions
pivot_products.rename_axis(None, axis=1)

                 Dia   Fnac  Walmart
category                            
Cleaning       23.50    NaN    11.42
Entertainment    NaN  15.95    19.99
Tech           55.75    NaN   111.55
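
If the goal is a flat table with category as an ordinary column, one possible approach is to chain reset_index() with the same rename_axis call. The sketch below rebuilds the question's data (without the unused testscore column) and prints the columns' name before and after; the variable name flat is only illustrative.

```python
import pandas as pd

products = pd.DataFrame({
    'category': ['Cleaning', 'Cleaning', 'Entertainment', 'Entertainment', 'Tech', 'Tech'],
    'store': ['Walmart', 'Dia', 'Walmart', 'Fnac', 'Dia', 'Walmart'],
    'price': [11.42, 23.50, 19.99, 15.95, 55.75, 111.55],
})

pivot_products = products.pivot(index='category', columns='store', values='price')

# 'store' is the name attached to the columns Index, not a column of data
print(pivot_products.columns.name)   # store

# Move 'category' back into a regular column, then clear the columns' name
# ('flat' is a hypothetical name used only for this example)
flat = pivot_products.reset_index().rename_axis(None, axis=1)

print(flat.columns.name)             # None
print(flat.columns.tolist())         # ['category', 'Dia', 'Fnac', 'Walmart']
```

Note that rename_axis returns a new DataFrame, so the result has to be assigned for the change to stick.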
piRSquared