
I am struggling to sort data within groups. I have data on the average wage of nine occupations, grouped by SEX and country ("GEO"). I would like a dataframe in which the occupations are ordered by average wage within each country and SEX, so that each country/SEX combination has its 9 occupations sorted by value. This is what I have:

df

  wage   Country SEX  OCCUPATION
0   6   BELGIUM  M    Elementary
1   4   BELGIUM  M    POLICE 
2   6   BELGIUM  M    MANAGERS
3   8   BELGIUM  M    PROFESSIONALS
2   6   BELGIUM  F    PROFESSOIONALS     
3   8   BELGIUM  F    MANAGERS
4   7   BELGIUM  F    POLICE
5   5   FRANCE   M    POLICE
6   3   FRANCE   M    PROFESSIONALS
7   2   FRANCE   M    MANAGERS

But I would like to have this :

  wage   Country SEX  OCCUPATION
1   4   BELGIUM  M    POLICE 
0   6   BELGIUM  M    Elementary
2   6   BELGIUM  M    MANAGERS
3   8   BELGIUM  M    PROFESSIONALS
2   6   BELGIUM  F    PROFESSOIONALS     
4   7   BELGIUM  F    POLICE
3   8   BELGIUM  F    MANAGERS
7   2   FRANCE   M    MANAGERS
6   3   FRANCE   M    PROFESSIONALS
5   5   FRANCE   M    POLICE

Finally, if possible, I would like to number the occupations within each group from 1 to the number of occupations, in order of wage. To illustrate:

  wage   Country SEX  OCCUPATION     ORDER
1   4   BELGIUM  M    POLICE            1
0   6   BELGIUM  M    Elementary        2
2   6   BELGIUM  M    MANAGERS          3
3   8   BELGIUM  M    PROFESSIONALS     4
2   6   BELGIUM  F    PROFESSOIONALS    1
4   7   BELGIUM  F    POLICE            2
3   8   BELGIUM  F    MANAGERS          3
7   2   FRANCE   M    MANAGERS          1
6   3   FRANCE   M    PROFESSIONALS     2
5   5   FRANCE   M    POLICE            3

This question is related to "pandas groupby sort within groups". I have read that question, but its approach did not work for me. This is what I tried in order to get my desired df:

df=df.sort_values(["Country","SEX","wage"],ascending=False).groupby(["Country","SEX"])

Unfortunately, Python returns this instead of a dataframe:

<pandas.core.groupby.groupby.DataFrameGroupBy object at 0x0000022CC92DC668>
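A minimal sketch of why this output appears, using a small made-up frame (the values below are illustrative, not the real data): `groupby` on its own only builds a lazy `DataFrameGroupBy` object, and a dataframe comes back only once an operation such as `head` is applied to it.

```python
import pandas as pd

# Small illustrative frame; the values are made up for this sketch.
df = pd.DataFrame({
    "wage": [6, 4, 5, 3],
    "Country": ["BELGIUM", "BELGIUM", "FRANCE", "FRANCE"],
    "SEX": ["M", "M", "M", "M"],
})

# sort_values returns a DataFrame, but chaining groupby turns the result
# into a lazy grouping object rather than a DataFrame.
grouped = df.sort_values(["Country", "SEX", "wage"], ascending=False).groupby(["Country", "SEX"])
print(type(grouped).__name__)

# Applying an operation to the grouping yields a regular DataFrame again,
# here the first (highest-wage) row of each Country/SEX group.
top = grouped.head(1)
print(top)
```

So assigning the result of `.groupby(...)` back to `df` replaces the dataframe with the grouping object, which is what the line above shows.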

"GEO", "SEX" and "occ" are all of dtype object.
obs_value is a float.
df is a DataFrame.

I would be grateful if someone could help me out.

Pat
    Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Apr 04 '19 at 10:15
    [Please don't post images of code/data (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) – jezrael Apr 04 '19 at 10:15
    Only `df=df.sort_values(["Country","SEX","wage"],ascending=False)` should work for you. No need for `groupby` – Sociopath Apr 04 '19 at 11:45
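
Building on the comment above, here is a hedged sketch of one way to get both the sorted frame and the ORDER column. It rebuilds the sample from the question using the column names shown in the sample table (`wage`, `Country`, `SEX`, `OCCUPATION`) and a default integer index, which is an assumption since the posted index labels repeat. Because the desired output lists M before F while wages ascend, `ascending` is given per column instead of a single `False`.

```python
import pandas as pd

# Sample data copied from the question (occupation spellings kept as posted).
# Assumption: a default 0..9 index replaces the repeated labels in the post.
df = pd.DataFrame({
    "wage": [6, 4, 6, 8, 6, 8, 7, 5, 3, 2],
    "Country": ["BELGIUM"] * 7 + ["FRANCE"] * 3,
    "SEX": ["M"] * 4 + ["F"] * 3 + ["M"] * 3,
    "OCCUPATION": [
        "Elementary", "POLICE", "MANAGERS", "PROFESSIONALS",
        "PROFESSOIONALS", "MANAGERS", "POLICE",
        "POLICE", "PROFESSIONALS", "MANAGERS",
    ],
})

# Sort by country (A-Z), then SEX (M before F), then wage (low to high).
sorted_df = df.sort_values(
    ["Country", "SEX", "wage"],
    ascending=[True, False, True],
)

# Number the rows 1..n inside each Country/SEX group, in the sorted order.
# cumcount counts from 0 within each group, hence the + 1.
sorted_df["ORDER"] = sorted_df.groupby(["Country", "SEX"]).cumcount() + 1

print(sorted_df)
```

Here `groupby` is used only to number the rows, not to sort them. If tied wages should share a number instead, `groupby(...)["wage"].rank(method="min")` is a possible alternative to `cumcount`.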

0 Answers