Arrange columns position in pandas based on specified string arrangement

Question

I have dataframe with similar column names between 'A' and 'B' as below:

df = pd.DataFrame({'A_Text1':1, 'A_Text2':2, 'A_Text3':3, 'B_Text1':4, 'B_Text2':5, 'B_Text3':6, 'A_Text4':7,  'B_Text4':8})

When i use solution from here, Re-ordering columns in pandas dataframe based on column name I will get as such because it arrange alphabetically:

df = pd.DataFrame({'A_Text1':1, 'A_Text2':2, 'A_Text3':3, 'A_Text4':7, 'B_Text1':4, 'B_Text2':5, 'B_Text3':6, 'B_Text4':8})

What I truly need is to be able to arrange the column names by how I want to as below:

df = pd.DataFrame({'A_Text3':3, 'A_Text4':7, 'A_Text1':1, 'A_Text2':2, 'B_Text3':6, 'B_Text4':8, 'B_Text1':4, 'B_Text2':5})

Where the arrangement of column names with "_Text" is always like I specified above (_Text3, _Text4, _Text1, _Text2) because i have hundreds of columns with similar "_Text" name.

jezrael · Accepted Answer · 2020-07-21T10:53:57.490

One idea is use mapping by helper dictionary for custom order with another dictionary for replace in rename, last is used Index.argsort and DataFrame.iloc for changed order by new columns names:

order = {'Text3':'1', 'Text4':'2', 'Text1':'3', 'Text2':'4'}

d = df.columns.to_series().replace(order, regex=True).to_dict()

print (df.rename(columns=d).columns)
Index(['A_3', 'A_4', 'A_1', 'B_3', 'B_4', 'B_1', 'A_2', 'B_2'], dtype='object')

df = df.iloc[:, df.rename(columns=d).columns.argsort()]
print (df)

   A_Text3  A_Text4  A_Text1  A_Text2  B_Text3  B_Text4  B_Text1  B_Text2
0        3        7        1        2        6        8        4        5

You can also create dictionary dynamic, also added zeros for correct sorting if 10+ columns in list:

order = ['Text3', 'Text4', 'Text1', 'Text2']

order_d = {v: f'{k:03}' for k, v in enumerate(order, 1)}
print (order_d)
{'Text3': '001', 'Text4': '002', 'Text1': '003', 'Text2': '004'}

d = df.columns.to_series().replace(order_d, regex=True)
print (df.rename(columns=d).columns)
Index(['A_003', 'A_004', 'A_001',
       'B_003', 'B_004', 'B_001', 'A_002', 'B_002'], dtype='object')

df = df.iloc[:, df.rename(columns=d).columns.argsort()]
print (df)
   A_Text3  A_Text4  A_Text1  A_Text2  B_Text3  B_Text4  B_Text1  B_Text2
0        3        7        1        2        6        8        4        5

score 0 · Answer 2 · answered Jul 21 '20 at 10:37

Lets try pd.Categorical the first idea is to create a dataframe of your columns and extract the alpha and numeric representation

then we apply a categorical sort.

s = df.columns.to_frame('')

s = s.join(s[0].str.extract("(\w)_\D+(\d)").rename(columns={0: "alpha", 1: "numeric"}))

print(s)

         0 alpha numeric
0  A_Text1     A       1
1  A_Text2     A       2
2  A_Text3     A       3
3  B_Text1     B       1
4  B_Text2     B       2
5  B_Text3     B       3
6  A_Text4     A       4
7  B_Text4     B       4

#define your custom order.
order = [3,4,1,2]
s['numeric'] = pd.Categorical(s['numeric'].astype(int),order,ordered=True)
s_ordered = s.sort_values(['alpha','numeric'])


         0 alpha numeric
2  A_Text3     A       3
6  A_Text4     A       4
0  A_Text1     A       1
1  A_Text2     A       2
5  B_Text3     B       3
7  B_Text4     B       4
3  B_Text1     B       1
4  B_Text2     B       2

then assign back to your columns.

df.columns = s_ordered[0].tolist()

  A_Text3  A_Text4  A_Text1  A_Text2  B_Text3  B_Text4  B_Text1  B_Text2
0        1        2        3        4        5        6        7        8

Arrange columns position in pandas based on specified string arrangement

2 Answers2