How to get rid of multilevel index after using pivot table pandas?

Question

I had following data frame (the real data frame is much more larger than this one ) :

sale_user_id    sale_product_id count
1                 1              1
1                 8              1
1                 52             1
1                 312            5
1                 315            1

Then reshaped it to move the values in sale_product_id as column headers using the following code:

reshaped_df=id_product_count.pivot(index='sale_user_id',columns='sale_product_id',values='count')

and the resulting data frame is:

sale_product_id -1057   1   2   3   4   5   6   8   9   10  ... 98  980 981 982 983 984 985 986 987 99
sale_user_id                                                                                    
1                NaN    1.0 NaN NaN NaN NaN NaN 1.0 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3                NaN    1.0 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4                NaN    NaN 1.0 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

as you can see we have a multililevel index , what i need is to have sale_user_is in the first column without multilevel indexing:

i take the following approach :

reshaped_df.reset_index()

the the result would be like this i still have the sale_product_id column , but i do not need it anymore:

sale_product_id sale_user_id    -1057   1   2   3   4   5   6   8   9   ... 98  980 981 982 983 984 985 986 987 99
0                          1    NaN 1.0 NaN NaN NaN NaN NaN 1.0 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1                          3    NaN 1.0 NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2                          4    NaN NaN 1.0 NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN

i can subset this data frame to get rid of sale_product_id but i don't think it would be efficient.I am looking for an efficient way to get rid of multilevel indexing while reshaping the original data frame

As of 2020, the solution would likely be to use `reshaped_df.droplevel(1, axis=0)` ([documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.droplevel.html)) but I'm not sure I fully understand the problem the selected answer solves. — mins, Dec 30 '20 at 15:53

jezrael · Accepted Answer · 2016-08-15T08:38:00.043

You need remove only index name, use rename_axis (new in pandas 0.18.0):

print (reshaped_df)
sale_product_id  1    8    52   312  315
sale_user_id                            
1                  1    1    1    5    1

print (reshaped_df.index.name)
sale_user_id

print (reshaped_df.rename_axis(None))
sale_product_id  1    8    52   312  315
1                  1    1    1    5    1

Another solution working in pandas below 0.18.0:

reshaped_df.index.name = None
print (reshaped_df)

sale_product_id  1    8    52   312  315
1                  1    1    1    5    1

If need remove columns name also:

print (reshaped_df.columns.name)
sale_product_id

print (reshaped_df.rename_axis(None).rename_axis(None, axis=1))
   1    8    52   312  315
1    1    1    1    5    1

Another solution:

reshaped_df.columns.name = None
reshaped_df.index.name = None
print (reshaped_df)
   1    8    52   312  315
1    1    1    1    5    1

EDIT by comment:

You need reset_index with parameter drop=True:

reshaped_df = reshaped_df.reset_index(drop=True)
print (reshaped_df)
sale_product_id  1    8    52   312  315
0                  1    1    1    5    1

#if need reset index nad remove column name
reshaped_df = reshaped_df.reset_index(drop=True).rename_axis(None, axis=1)
print (reshaped_df)
   1    8    52   312  315
0    1    1    1    5    1

Of if need remove only column name:

reshaped_df = reshaped_df.rename_axis(None, axis=1)
print (reshaped_df)
              1    8    52   312  315
sale_user_id                         
1               1    1    1    5    1

Edit1:

So if need create new column from index and remove columns names:

reshaped_df =  reshaped_df.rename_axis(None, axis=1).reset_index() 
print (reshaped_df)
   sale_user_id  1  8  52  312  315
0             1  1  1   1    5    1

I tried all the solutions, actulally first two solution remove sale_user_id but i need it as the first column, i need to get rid of sale_product_id. — chessosapiens, Aug 15 '16 at 08:18
i think we need to add reset_index() , reshaped_df.rename_axis(None, axis=1).reset_index() will give me exactly what i need. — chessosapiens, Aug 15 '16 at 08:35

score 9 · Answer 2 · answered Nov 24 '20 at 15:48

Make a DataFrame

import random

d = {'Country': ['Afghanistan','Albania','Algeria','Andorra','Angola']*2, 
     'Year': [2005]*5 + [2006]*5, 'Value': random.sample(range(1,20),10)}
df = pd.DataFrame(data=d)

df:

                Country         Year   Value    
1               Afghanistan     2005    6
2               Albania         2005    13
3               Algeria         2005    10
4               Andorra         2005    11
5               Angola          2005    5
6               Afghanistan     2006    3
7               Albania         2006    2
8               Algeria         2006    7
9               Andorra         2006    3
10              Angola          2006    6

Pivot

table = df.pivot(index='Country',columns='Year',values='Value')

Table:

Year    Country         2005    2006
0       Afghanistan     16      9
1       Albania         17      19
2       Algeria         11      7
3       Andorra         5       12
4       Angola          6       18

I want 'Year' to be 'index':

clean_tbl = table.rename_axis(None, axis=1).reset_index(drop=True)

clean_tbl:

    Country         2005    2006
0   Afghanistan     16      9
1   Albania         17      19
2   Algeria         11      7
3   Andorra         5       12
4   Angola          6       18

Done!

score 4 · Answer 3 · answered Mar 02 '22 at 13:05

You can also use a to_flat_index method of MultiIndex object to convert it into a list of tuples, which you can then concatenate with list comprehension and use it to overwrite the .columns attribute of your dataframe.

# create a dataframe
df = pd.DataFrame({"a": [1, 2, 3, 1], "b": ["x", "x", "y", "y"], "c": [0.1, 0.2, 0.1, 0.2]})


    a   b   c
0   1   x   0.1
1   2   x   0.2
2   3   y   0.1
3   1   y   0.2

# pivot the dataframe
df_pivoted = df.pivot(index="a", columns="b")

    c
b   x   y
a       
1   0.1 0.2
2   0.2 NaN
3   NaN 0.1

Now let's overwrite the .columns attribute and .reset_index():

df_pivoted.columns = ["_".join(tup) for tup in df_pivoted.columns.to_flat_index()]
df_pivoted.reset_index()

    a   c_x c_y
0   1   0.1 0.2
1   2   0.2 NaN
2   3   NaN 0.1

score 2 · Answer 4 · edited Nov 29 '21 at 15:57

2

We need to reset_index() to reset the index columns back into the dataframe, then rename_axis() to rename the index to None and the columns to their axis=1 (column headers) values.

reshaped_df = reshaped_df.reset_index().rename_axis(None, axis=1)

edited Nov 29 '21 at 15:57

ah bon

9,293
12
65
148

answered Oct 19 '21 at 17:47

embulldogs99

840
9
9

score 0 · Answer 5 · answered Dec 02 '21 at 16:38

Pivot from long to wide format using pivot:

import pandas
df = pandas.DataFrame({
    "lev1": [1, 1, 1, 2, 2, 2],
    "lev2": [1, 1, 2, 1, 1, 2],
    "lev3": [1, 2, 1, 2, 1, 2],
    "lev4": [1, 2, 3, 4, 5, 6],
    "values": [0, 1, 2, 3, 4, 5]})
df_wide = df.pivot(index="lev1", columns=["lev2", "lev3"], values="values")
df_wide

# lev2    1         2
# lev3    1    2    1    2
# lev1
# 1     0.0  1.0  2.0  NaN
# 2     4.0  3.0  NaN  5.0

Rename the (sometimes confusing) axis names

df_wide.rename_axis(columns=[None, None])

#         1         2
#         1    2    1    2
# lev1
# 1     0.0  1.0  2.0  NaN
# 2     4.0  3.0  NaN  5.0

score -1 · Answer 6 · answered May 23 '19 at 06:20

-1

The way it works for me is

df_cross=pd.DataFrame(pd.crosstab(df[c1], df[c2]).to_dict()).reset_index()

answered May 23 '19 at 06:20

Yury Wallet

1,474
1
13
24

How to get rid of multilevel index after using pivot table pandas?

6 Answers6

Linked