5

I have the following data:

Set Coolthing Route Organ Up Down
set4 Pam3CSK4 ID LL 81 60
set4 Poly_IC ID LL 542 92
set4 Poly_IC ID MM 73 73
set4 cdiGMP ID MM 143 78
set4 Poly_IC ID BB 32 82
set4 cdiGMP ID BB 90 129

With the following code:

import pandas as pd
df = pd.io.parsers.read_table("http://dpaste.com/2PHS7R0.txt",sep=" ")
df = df.pivot(index="Coolthing",columns="Organ").fillna(0)
df.drop('Set',axis=1,inplace=True)
df.drop('Route',axis=1,inplace=True)
df.index.name = None
df.columns.names = (None,None)

I get the following:

In [75]: df
Out[75]:
          Up            Down
          BB   LL   MM    BB  LL  MM
Pam3CSK4   0   81    0     0  60   0
Poly_IC   32  542   73    82  92  73
cdiGMP    90    0  143   129   0  78

What I want to do is to sort the row with case insensitive way yielding this:

          Up            Down
          BB   LL   MM    BB  LL  MM
cdiGMP    90    0  143   129   0  78
Pam3CSK4   0   81    0     0  60   0
Poly_IC   32  542   73    82  92  73

How can I achieve that?

pdubois
  • 7,640
  • 21
  • 70
  • 99

3 Answers3

7

Building on @Marius case_insensitive_order, a single liner using reindex

In [63]: df.reindex(sorted(df.index, key=lambda x: x.lower()))
Out[63]:
          Up            Down
          BB   LL   MM    BB  LL  MM
cdiGMP    90    0  143   129   0  78
Pam3CSK4   0   81    0     0  60   0
Poly_IC   32  542   73    82  92  73
Zero
  • 74,117
  • 18
  • 147
  • 154
  • Thanks for this answer, Zero. I've extended it to work with multiple columns https://stackoverflow.com/a/46358081/1072869 – Aralox Sep 22 '17 at 06:20
3

You can force this by using the new CategoricalIndex (new in 0.16.1, I think), but I'm not sure if it's a great idea as it may have unpredictable effects:

case_insenstive_order = sorted(df.index, key=lambda x: x.lower())
case_insenstive_order
Out[4]: ['cdiGMP', 'Pam3CSK4', 'Poly_IC']

df.index = pd.CategoricalIndex(df.index, 
                               categories=case_insenstive_order, 
                               ordered=True)
df.sort_index()
Out[7]: 
          Up           Down        
          BB   LL   MM   BB  LL  MM
cdiGMP    90    0  143  129   0  78
Pam3CSK4   0   81    0    0  60   0
Poly_IC   32  542   73   82  92  73
Marius
  • 58,213
  • 16
  • 107
  • 105
2

I consider this a valid answer too:

df = df.iloc[df.index.str.lower().argsort()]

However, reindex certainly works a bit faster:

%timeit df.reindex(sorted(df.index, key=lambda x: x.lower()), copy=True)
1000 loops, best of 3: 794 µs per loop

%timeit df.iloc[df.index.str.lower().argsort()]
1000 loops, best of 3: 850 µs per loop

I tested here with pandas 0.20.3 and python2 on a table that had 500 rows and about 50 columns.

normanius
  • 8,629
  • 7
  • 53
  • 83