I am struggling to sort data within groups. I have the average wage of nine different occupations, grouped by SEX and country ("GEO"). I would like a dataframe in which, for each country and SEX, the occupations are ordered by average wage, so that each country and SEX combination has its 9 occupations sorted by value. This is what I have:
df
wage Country SEX OCCUPATION
0 6 BELGIUM M Elementary
1 4 BELGIUM M POLICE
2 6 BELGIUM M MANAGERS
3 8 BELGIUM M PROFESSIONALS
2 6 BELGIUM F PROFESSOIONALS
3 8 BELGIUM F MANAGERS
4 7 BELGIUM F POLICE
5 5 FRANCE M POLICE
6 3 FRANCE M PROFESSIONALS
7 2 FRANCE M MANAGERS
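For reproducibility, the sample above could be rebuilt roughly like this. It is only a sketch: the index labels are copied from the listing (including the repeated 2 and 3), and the occupation names are spelled exactly as shown.

```python
import pandas as pd

# Sample frame copied from the listing above; values and spelling are kept as shown.
df = pd.DataFrame(
    {
        "wage": [6, 4, 6, 8, 6, 8, 7, 5, 3, 2],
        "Country": ["BELGIUM"] * 7 + ["FRANCE"] * 3,
        "SEX": ["M", "M", "M", "M", "F", "F", "F", "M", "M", "M"],
        "OCCUPATION": [
            "Elementary", "POLICE", "MANAGERS", "PROFESSIONALS",
            "PROFESSOIONALS", "MANAGERS", "POLICE",
            "POLICE", "PROFESSIONALS", "MANAGERS",
        ],
    },
    # Index labels as printed in the listing (2 and 3 appear twice there).
    index=[0, 1, 2, 3, 2, 3, 4, 5, 6, 7],
)

print(df)
```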
But I would like to have this:
wage Country SEX OCCUPATION
1 4 BELGIUM M POLICE
0 6 BELGIUM M Elementary
2 6 BELGIUM M MANAGERS
3 8 BELGIUM M PROFESSIONALS
2 6 BELGIUM F PROFESSOIONALS
4 7 BELGIUM F POLICE
3 8 BELGIUM F MANAGERS
7 2 FRANCE M MANAGERS
6 3 FRANCE M PROFESSIONALS
5 5 FRANCE M POLICE
Finally, if possible, I would like to number the occupations within each group from 1 to the number of occupations, in order of wage. To illustrate:
wage Country SEX OCCUPATION ORDER
1 4 BELGIUM M POLICE 1
0 6 BELGIUM M Elementary 2
2 6 BELGIUM M MANAGERS 3
3 8 BELGIUM M PROFESSIONALS 4
2 6 BELGIUM F PROFESSOIONALS 1
4 7 BELGIUM F POLICE 2
3 8 BELGIUM F MANAGERS 3
7 2 FRANCE M MANAGERS 1
6 3 FRANCE M PROFESSIONALS 2
5 5 FRANCE M POLICE 3
This question is related to: pandas groupby sort within groups. I have read it, but it did not work for me. This is what I tried in order to get my desired df:
df=df.sort_values(["Country","SEX","wage"],ascending=False).groupby(["Country","SEX"])
Unfortunately, Python returns this instead of a dataframe:
<pandas.core.groupby.groupby.DataFrameGroupBy object at 0x0000022CC92DC668>
"GEO", "SEX" and "occ" are all of dtype object, obs_value is a float, and df is a dataframe.
I would be grateful if someone could help me out.
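One possible approach, as a minimal sketch rather than a definitive answer: `groupby` on its own returns a lazy `DataFrameGroupBy` object, so a dataframe only comes back once something is applied to it. The sketch below uses the sample column names (Country, SEX, wage, OCCUPATION). It assumes the groups should stay in the order in which they first appear, and the helper column `_grp` is a hypothetical name used only for illustration.

```python
import pandas as pd

# Sample rows copied from the question: (index label, wage, Country, SEX, OCCUPATION).
rows = [
    (0, 6, "BELGIUM", "M", "Elementary"),
    (1, 4, "BELGIUM", "M", "POLICE"),
    (2, 6, "BELGIUM", "M", "MANAGERS"),
    (3, 8, "BELGIUM", "M", "PROFESSIONALS"),
    (2, 6, "BELGIUM", "F", "PROFESSOIONALS"),
    (3, 8, "BELGIUM", "F", "MANAGERS"),
    (4, 7, "BELGIUM", "F", "POLICE"),
    (5, 5, "FRANCE", "M", "POLICE"),
    (6, 3, "FRANCE", "M", "PROFESSIONALS"),
    (7, 2, "FRANCE", "M", "MANAGERS"),
]
df = pd.DataFrame(
    [r[1:] for r in rows],
    columns=["wage", "Country", "SEX", "OCCUPATION"],
    index=[r[0] for r in rows],
)

keys = ["Country", "SEX"]

# Number each (Country, SEX) group in order of first appearance, so that
# BELGIUM/M stays ahead of BELGIUM/F as in the desired output.
# "_grp" is a throwaway helper column (hypothetical name).
group_id = df.groupby(keys, sort=False).ngroup().to_numpy()

# Sort by group first, then by ascending wage inside each group.
out = (
    df.assign(_grp=group_id)
      .sort_values(["_grp", "wage"])
      .drop(columns="_grp")
)

# Running position within each group, starting at 1.
out["ORDER"] = out.groupby(keys).cumcount().to_numpy() + 1

print(out)
```

If alphabetical group order is acceptable, `df.sort_values(["Country", "SEX", "wage"])` followed by the same `cumcount` step should be enough, without the helper column.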