-1

Suppose that I want to sort a data frame in Pandas and my data frame looks like this

   First Name    Last Name    Street        Address Type

0  Joe           Smith        123 Main St.  Property Address
1  Gregory       Stanton      124 Main St.  X Old Property Address
2  Phill         Allen        38 Maple St.  Alternate Address
3  Joe           Smith        PO Box 3165   Alternate Address
4  Xi            Dong         183 Main St.  Property Address
5  Phill         Allen        128 Main St.  Property Address

I want to first sort the data frame by last name so that it will look like this:

   First Name    Last Name    Street        Address Type

0  Phill         Allen        38 Maple St.  Alternate Address
1  Phill         Allen        128 Main St.  Property Address
2  Xi            Dong         183 Main St.  Property Address
3  Joe           Smith        123 Main St.  Property Address
4  Joe           Smith        PO Box 3165   Alternate Address
5  Gregory       Stanton      124 Main St.  X Old Property Address

Now for each person I want the property address to become before the alternate address (If the person has both a property and alternate address) so that the dataframe would look like this:

   First Name    Last Name    Street        Address Type

0  Phill         Allen        128 Main St   Property Address
1  Phill         Allen        38 Maple St.  Alternate Address
2  Xi            Dong         183 Main St.  Property Address
3  Joe           Smith        123 Main St.  Property Address
4  Joe           Smith        PO Box 3165   Alternate Address
5  Gregory       Stanton      124 Main St.  X Old Property Address

Notice that Phill Allen's entries got switched in the above data frame because his alternate address came before his property address. My code looks like this:

duplicates = df[df.duplicated(['Last Name'], keep=False)]
duplicates = duplicates.sort_values(['Last Name'], ascending = True)
duplicates = duplicates.sort_values(['Address Type'], ascending = True)

I have already tried using

duplicates = df.sort_values(['last', 'Address Type'], ascending = True) 

This does not work because the Address Type can be many different things not just primary/alternate and this code will not necessarily always work when sorted in ascending/descending order.

But it does not switch the property address and alternate address in the correct order because python first sorts the dataframe by Last Name then resorts it based on Address Type. I am looking for code that will sort by last name first and based on those last names, then sort by address type. Any help would be appreciated. Thanks!

Sterling
  • 1
  • 1
  • To generalize, you could turn `Address Type` into an ordered categorical, then sort by your 2 columns: `df['Address Type'] = pd.Categorical(df['Address Type'],categories = ['Property Address','Alternate Address'],ordered=True)`, then `df.sort_values(['Last Name', 'Address Type'], inplace=True)` – sacuL Aug 08 '18 at 19:21

1 Answers1

0

You can sort by multiple columns. Just put both columns in the list.

duplicates = duplicates.sort_values(['Last Name', 'Address Type'], ascending = True)
RCA
  • 508
  • 4
  • 12
  • Unfortunately, I have already tried this solution and it does not work. – Sterling Aug 08 '18 at 19:12
  • 1
    @SterlingBhollah have you tried `duplicates.sort_values(['Last Name', 'Address Type'], ascending=[True, False])` , as "Address type" is not in ascending. – whoosis Aug 08 '18 at 19:14
  • @SterlingBhollah What happened? – RCA Aug 08 '18 at 19:15
  • @RCA I have updated my question to make it more clear. I apologize for the confusion. However, the Address type will not always be a binary (Property/ Alternate) sometimes it will be many things. For example, I will always have property address, but sometimes I will have alternate contact or contact: Alternate in place of alternate address. I essentially am looking for something to put property address first and anything else that is not property address after in successive rows. – Sterling Aug 08 '18 at 19:27