You can do it this way:
- Split the second column with alphanumeric strings into 2 columns: one column
Letter
to hold the first letter and another column Number
to hold a number of one or two digits.
- Convert
Number
column from string to integer.
- Then, sort these 2 new columns together with the first column of integers
Let's illustrate the process with an example below:
Assume we have the dataframe df
as follows:
print(df)
Col1 Col2
0 2 B12
1 11 C2
2 2 A1
3 11 B2
4 2 B1
5 11 C12
6 2 A12
7 11 C1
8 2 A2
Step 1 & 2: Split Col2
into 2 columns Letter
& Number
+ Convert Number
column from string to integer:
df['Letter'] = df['Col2'].str[0] # take 1st char
df['Number'] = df['Col2'].str[1:].astype(int) # take 2nd char onwards and convert to integer
Result:
print(df)
Col1 Col2 Letter Number
0 2 B12 B 12
1 11 C2 C 2
2 2 A1 A 1
3 11 B2 B 2
4 2 B1 B 1
5 11 C12 C 12
6 2 A12 A 12
7 11 C1 C 1
8 2 A2 A 2
Step 3: Sort Col1
, Letter
and Number
with priority: Col1
---> Number
---> Letter
:
df = df.sort_values(by=['Col1', 'Number', 'Letter'])
Result:
print(df)
Col1 Col2 Letter Number
2 2 A1 A 1
4 2 B1 B 1
8 2 A2 A 2
6 2 A12 A 12
0 2 B12 B 12
7 11 C1 C 1
3 11 B2 B 2
1 11 C2 C 2
5 11 C12 C 12
After sorting, you can remove the Letter
and Number
columns, as follows:
df = df.drop(['Letter', 'Number'], axis=1)
If you want to do all in one step, you can also chain the instructions, as follows:
df = (df.assign(Letter=df['Col2'].str[0],
Number=df['Col2'].str[1:].astype(int))
.sort_values(by=['Col1', 'Number', 'Letter'])
.drop(['Letter', 'Number'], axis=1)
)
Result:
print(df)
Col1 Col2
2 2 A1
4 2 B1
8 2 A2
6 2 A12
0 2 B12
7 11 C1
3 11 B2
1 11 C2
5 11 C12