I am trying to convert a DataFrame like this:
import pandas as pd

df = pd.DataFrame({'A': ['A1', 'A1', 'A1', 'A1', 'A1', 'A1', 'A2', 'A2', 'A2', 'A2', 'A2', 'A2', 'A2'],
                   'B': ['B1', 'B1', 'B2', 'B2', 'B3', 'B3', 'B4', 'B5', 'B6', 'B7', 'B7', 'B8', 'B8']})
by taking, within each A, the n (here 2) values of B with the largest counts, to get:
My approach so far:
df = df.groupby(['A', 'B'])['A'].count()
df = df.groupby(level=0).nlargest(2).reset_index(level=0, drop=True)
which gives me the following (close to what I need):
Now, the only methods I know for transforming a MultiIndex are:
df.reset_index(level=1)
df.unstack()
But they don't give me what I am looking for. Is there a DataFrame method that will do this for me, or do I need to work around it with apply? One way would be to loop through every pair of values in: df.index.get_level_values(level=1)
and put each pair into a new DataFrame with 2 columns. But this will break if a value in index level 0 has only one value in index level 1.
Also: I don't care about the order of the nlargest results when the counts are tied.
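For reference, here is a rough sketch of the reshaping I have in mind, without the explicit loop. It assumes the target is one row per A with the top-n B values spread across n columns. The helper name `top_n_wide` and the `count` and `rank` column names are made up for illustration. It uses sort_values plus groupby head in place of nlargest, and numbers the B values within each A with cumcount so that pivot can spread them into columns.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['A1'] * 6 + ['A2'] * 7,
    'B': ['B1', 'B1', 'B2', 'B2', 'B3', 'B3',
          'B4', 'B5', 'B6', 'B7', 'B7', 'B8', 'B8'],
})


def top_n_wide(frame, n):
    """Hypothetical helper: one row per A, top-n B values as columns 0..n-1."""
    # Count occurrences of each (A, B) pair.
    counts = frame.groupby(['A', 'B']).size()
    # Keep the n most frequent B values per A (tie order is not guaranteed).
    top = (counts.sort_values(ascending=False)
                 .groupby(level='A').head(n)
                 .reset_index(name='count'))
    # Number the kept rows within each A: 0, 1, ..., n-1.
    top['rank'] = top.groupby('A').cumcount()
    # Spread the B values across columns; missing ranks become NaN.
    return top.pivot(index='A', columns='rank', values='B')


wide = top_n_wide(df, 2)
print(wide)
```

If some A has fewer than n distinct B values, the missing cells come out as NaN instead of raising, which avoids the breakage I mentioned with the loop.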