
How can I sort the following pandas index in natural (alphanumeric) order?

idx = pd.Index(['A1 yes', 'A2 no', 'A3 no', 'A10 yes'])

idx.str[:3].to_series().value_counts().sort_index()

A1     1
A10    1
A3     1

How can I sort it as A1, A3, A10 instead of A1, A10, A3?

Cong Ma

1 Answer


Use natsorted + reindex:

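import pandas as pd
from natsort import natsorted  # third-party package: pip install natsort

idx = pd.Index(['A1 yes', 'A2 no', 'A3 no', 'A10 yes'])
s = idx.str[:3].to_series().value_counts()
# Reorder the counts by the naturally sorted labels
s = s.reindex(natsorted(s.index))
print(s)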
A1     1
A2     1
A3     1
A10    1
dtype: int64
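
If installing natsort is not an option, the idea behind natural ordering can be sketched with the standard library alone. This is a simplified illustration, not natsort's actual implementation; `natural_key` is a hypothetical helper name, and the labels are typed in directly instead of coming from `value_counts()`:

```python
import re

def natural_key(label):
    # Split into alternating non-digit and digit runs, e.g. 'A10' -> ['A', '10', ''].
    parts = re.split(r"(\d+)", label)
    # Digit runs compare as integers, everything else as plain strings.
    return [int(p) if p.isdigit() else p for p in parts]

labels = ["A1", "A10", "A3", "A2"]
ordered = sorted(labels, key=natural_key)
print(ordered)
```

The same key could then be used as `s.reindex(sorted(s.index, key=natural_key))`.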

Or extract the digits and sort by the numeric part only:

# Raw string avoids an invalid-escape warning for \d
s = s.iloc[s.index.str.extract(r'(\d+)', expand=False).astype(int).argsort()]
print(s)
A1     1
A2     1
A3     1
A10    1
dtype: int64
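
The same numeric-only ordering can probably be written more directly with the `key` argument of `sort_index`, assuming pandas 1.1 or later. The sketch below builds the counts by hand for brevity, and `numeric_part` is a hypothetical helper name:

```python
import pandas as pd

# Same labels as in the question; counts are typed in directly for brevity.
s = pd.Series([1, 1, 1, 1], index=["A1", "A10", "A2", "A3"])

def numeric_part(index):
    # key receives the whole Index and must return an array-like of equal length.
    return index.str.extract(r"(\d+)", expand=False).astype(int)

sorted_s = s.sort_index(key=numeric_part)
print(sorted_s)
```

Like the `argsort` version, this ignores the letter prefix, so it only suits labels that share one prefix.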

Finally, to sort by the string prefix first and then by the number:

# Split each label into a non-digit prefix (a) and a numeric suffix (b)
df = (s.index.to_series().str.extract(r'(?P<a>\D+)(?P<b>\d+)', expand=True)
      .assign(b=lambda x: x['b'].astype(int))
      .sort_values(['a','b']))
print(df)
     a   b
A1   A   1
A2   A   2
A3   A   3
A10  A  10

s = s.reindex(df.index)
print(s)
A1     1
A2     1
A3     1
A10    1
dtype: int64
SethMMorton
jezrael
  • Do you think this might be a duplicate of https://stackoverflow.com/questions/29580978/naturally-sorting-pandas-dataframe? – SethMMorton Feb 15 '18 at 05:44
  • @SethMMorton - Yes, the first part is the same as `EdChum's` answer; the second and third are different. So it seems to be a dupe. – jezrael Feb 15 '18 at 06:06