Matching dataframe row order with another dataframe in Python based on str data

Question

I have 2 dataframes that both contain the same columns but are ordered differently. The first dataframe is ordered this way:

   Code  Description  Value
0  01        A         .3
1  01        B         .5
2  01        C        1.2
3  01        D         .4
4  01        E         .6
5  01        F         .9

The second dataframe is ordered this way:

   Code  Description  Value
0  12        A         .5
1  13        A         .3
2  14        A         .7
3  15        A         2.4
4  16        A         4.6
5  17        A         .6

Basically, they both contain all of the same type of content but are just ordered differently. Dataframe 1 lists all the descriptions in a particular order for a code, then moves on to the next code and repeats. Dataframe 2 lists all the codes for a description, then moves on to the next description and repeats. My question is how can I get dataframe 2 to match the same ordering pattern as dataframe 1?

What I want dataframe 2 to become:

   Code  Description  Value
0  12        A         .5
1  12        B         .6
2  12        C         .1
3  12        D         2.8
4  12        E         .4
5  12        F         .2

I was thinking that some sort of sort function might be useful here, but the main challenge is that I have to sort by description (which is a str), in a particular order that I define (in this case it is just A, B, C but can be any order) while also grouping by code.

Where are the `Value` values coming from in the output frame? — Henry Ecker, Jul 16 '21 at 02:05
Does this help? https://stackoverflow.com/questions/11067027/re-ordering-columns-in-pandas-dataframe-based-on-column-name — le_camerone, Jul 16 '21 at 02:16
the values are just numbers corresponding to each row, everything is just pulled from an excel file @HenryEcker — Jonathan Chen, Jul 16 '21 at 02:26

score 1 · Answer 1 · answered Jul 16 '21 at 02:28

1

If you set the two columns to be sorted in this order:

df.sort_values(by=['Code','Description'])

helps?

answered Jul 16 '21 at 02:28

emdemor

21
3

The only problem is I want to define how Description is sorted. – Jonathan Chen Jul 16 '21 at 02:35

score 0 · Accepted Answer · answered Jul 16 '21 at 03:12

After getting some helpful feedback I realized what I really needed was how to sort by custom list, so I found sorting by a custom list in pandas . I tried the selected answer which did not work for me but the second answer did. My implementation is below.

sorter=['A','B','C'] #whatever order I want Description to be sorted in 
df.Description = df.Description.astype("category")
df.Description.cat.set_categories(sorter, inplace=True)    
df.sort_values(by=['Code','Description'])

Matching dataframe row order with another dataframe in Python based on str data

2 Answers2