I have a dataframe, which consists of summary statistics of another dataframe:

df = sample[['Place','Lifeexp']]
df = df.groupby('Place').agg(['count','mean', 'max','min']).reset_index()
df = df.sort_values([('Lifeexp', 'count')], ascending=False)

Looking at its structure, the dataframe has a MultiIndex in its columns, which makes plotting difficult:

df.columns

MultiIndex(levels=[['Lifeexp', 'Place'], ['count', 'mean', 'max', 'min', '']],
           labels=[[1, 0, 0, 0, 0], [4, 0, 1, 2, 3]])
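
For reference, here is a minimal reproduction of that structure. The data is made up (the real `sample` is not shown here), so treat it as a sketch; it only illustrates that every column label becomes a tuple, which is why a plain name like 'count' no longer matches a column:

```python
import pandas as pd

# Hypothetical stand-in for `sample`; the real data is not shown in the question.
sample = pd.DataFrame({
    "Place": ["A", "A", "A", "B", "B", "C"],
    "Lifeexp": [70.0, 72.0, 74.0, 60.0, 64.0, 80.0],
})

df = sample[["Place", "Lifeexp"]]
# Aggregating the whole frame with a list of functions yields two-level columns.
df = df.groupby("Place").agg(["count", "mean", "max", "min"]).reset_index()

# Each column label is now a (top level, second level) tuple.
column_tuples = list(df.columns)
print(column_tuples)

# A column has to be addressed by its full tuple.
counts = df[("Lifeexp", "count")]
```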

I tried the solutions from other questions here (e.g. this), but somehow did not get the desired result. I want df to have Place, count, mean, max, min as column names and to drop the Lifeexp level, so that I can create plots easily, e.g. df.plot.bar(x = "Place", y = 'count')

user27074
    Well your column multiindex is `['Lifeexp', 'Place']` which looks suspiciously like it came from `df = sample[['Place','Lifeexp']]`. Check if that's the case. Show us df.columns after each of the three lines of your code. – smci May 16 '18 at 11:24

1 Answer

I think the simplest solution is to select the column after groupby, which prevents a MultiIndex in the columns:

df = df.groupby('Place')['Lifeexp'].agg(['count','mean', 'max','min']).reset_index()

df = df.sort_values('count', ascending=False)
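
A self-contained sketch of the same idea, assuming made-up data in place of `sample` (the real data is not shown in the question), which can be used to check that the columns come out flat:

```python
import pandas as pd

# Hypothetical stand-in for `sample`; the real data is not shown in the question.
sample = pd.DataFrame({
    "Place": ["A", "A", "A", "B", "B", "C"],
    "Lifeexp": [70.0, 72.0, 74.0, 60.0, 64.0, 80.0],
})

# Selecting the single column before agg keeps the result's columns flat.
df = (
    sample.groupby("Place")["Lifeexp"]
    .agg(["count", "mean", "max", "min"])
    .reset_index()
    .sort_values("count", ascending=False)
)

columns = list(df.columns)
print(columns)
```

With flat column names, a call such as df.plot.bar(x="Place", y="count") can refer to the columns directly.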
jezrael