
I have a pandas data frame like this,

   Name     Not_Included  Quantity Not_Included  
0  Auto     DNS           10       DNS
1  NaN      DNS           12       DNS
2  Rtal     DNS           18       DNS
3  NaN      DNS           14       DNS
4  Indl     DNS           16       DNS
5  NaN      DNS           18       DNS

Now I want to rename each Not_Included column using its column index in the data frame, so that the output looks like this,

   Name     Not_Included_1  Quantity Not_Included_3
0  Auto     DNS             10       DNS
1  NaN      DNS             12       DNS
2  Rtal     DNS             18       DNS
3  NaN      DNS             14       DNS
4  Indl     DNS             16       DNS
5  NaN      DNS             18       DNS

I tried the following,

for c,v in enumerate(s_df):
    if v == 'Not_Included':
        vi = 'Not_Included' + str(c)
        s_df.rename(columns=lambda n: n.replace(v, vi), inplace=True)

I get the following result,

    Name    Not_Included31  Quantity  Not_Included31
0   Auto    DNS             10        DNS
1   NaN     DNS             12        DNS
2   Rtal    DNS             18        DNS
3   NaN     DNS             14        DNS
4   Indl    DNS             16        DNS
5   NaN     DNS             18        DNS

There are posts about renaming all of a data frame's columns at once, but that is not what I am looking for, since I am automating some tasks. How can I get my desired output using the column indexes?

Also, can I rename the columns with a list comprehension?

Any ideas would be great.

user9431057

2 Answers


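You can use np.where to set the columns, adding the index suffix only where a name is duplicated. Note that duplicated() flags only the second and later occurrences by default, so the first Not_Included keeps its name.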

import numpy as np

df.columns = np.where(df.columns.duplicated(),  
                      [f'{df.columns[i]}_{i}' for i in range(len(df.columns))],
                      df.columns)

Indices also have a where method:

df.columns = df.columns.where(~df.columns.duplicated(),
                              [f'{df.columns[i]}_{i}' for i in range(len(df.columns))])

Output:

   Name Not_Included  Quantity Not_Included_3
0  Auto          DNS        10            DNS
1   NaN          DNS        12            DNS
2  Rtal          DNS        18            DNS
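
If every Not_Included column should get a suffix, as in the question's desired output, one option is to pass keep=False to duplicated(), which flags all members of a duplicated group including the first. The following is a minimal sketch, assuming the frame can be rebuilt from the first three rows of the question; the names is_dup and suffixed are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild part of the frame from the question; duplicate column names are passed explicitly.
df = pd.DataFrame(
    [["Auto", "DNS", 10, "DNS"],
     [np.nan, "DNS", 12, "DNS"],
     ["Rtal", "DNS", 18, "DNS"]],
    columns=["Name", "Not_Included", "Quantity", "Not_Included"],
)

# keep=False marks every occurrence of a repeated name, not just the later ones.
is_dup = df.columns.duplicated(keep=False)

# Candidate names with the positional index appended.
suffixed = [f"{name}_{i}" for i, name in enumerate(df.columns)]

# Use the suffixed name only where the original name is repeated.
df.columns = np.where(is_dup, suffixed, df.columns)

print(list(df.columns))
# ['Name', 'Not_Included_1', 'Quantity', 'Not_Included_3']
```

This renames any repeated column name, not only Not_Included, which may or may not be what you want.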
ALollz

This works too:

df.columns = ['{}_{}'.format(coluna, index) if 'Not_Included' in coluna else coluna for index, coluna in enumerate(df.columns)]
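
Because the comprehension builds every new name in a single pass, no column is renamed twice. In the question's loop, each rename call applies replace to all columns, so both columns first become Not_Included1 and then Not_Included31. Here is a self-contained sketch of the same idea, assuming a two-row version of the question's frame; it compares with == rather than in, so columns that merely contain the text (a hypothetical Not_Included_Total, say) are left alone. The name target is only illustrative.

```python
import pandas as pd

# Small stand-in for the frame in the question, with a repeated column name.
df = pd.DataFrame(
    [["Auto", "DNS", 10, "DNS"],
     ["Rtal", "DNS", 18, "DNS"]],
    columns=["Name", "Not_Included", "Quantity", "Not_Included"],
)

target = "Not_Included"  # exact column name that should receive the index suffix

# Build all names in one pass; only exact matches get "_<position>" appended.
new_columns = [
    f"{name}_{i}" if name == target else name
    for i, name in enumerate(df.columns)
]
df.columns = new_columns

print(new_columns)
# ['Name', 'Not_Included_1', 'Quantity', 'Not_Included_3']
```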
Terry