I have a DataFrame in which, for reasons outside my control, the number of columns varies from 1 to 20, and the columns are always named 1, 2, 3, 4, 5, and so on.

One day there are four columns:

import pandas as pd

data = {'1': ['A', 'B', 'C', 'D', 'E'],
        '2': [1, 0, 1, 0, 1], 
        '3': [1, 1, 0, 0, 3],
        '4': [0, 0, 1, 1, 1]}
df = pd.DataFrame(data)
df

And another day there are 2 columns:

data = {'1': ['A', 'B', 'C', 'D', 'E'], 
        '2': [1, 0, 1, 0, 1]}
df = pd.DataFrame(data)
df

What I want:

I want to prefix every column name with "variable_", regardless of how many columns there are. The four-column example would then look like this:

data = {'variable_1': ['A', 'B', 'C', 'D', 'E'], 
        'variable_2': [1, 0, 1, 0, 1], 
        'variable_3': [1, 1, 0, 0, 3],
        'variable_4': [0, 0, 1, 1, 1]}
df = pd.DataFrame(data)
df

I could do it with a loop, but I was hoping there is a simpler way.

Anton

1 Answer

df.rename can take a function that is applied to each column name, so you can do something like this. Note that it returns a new DataFrame, so assign the result back if you want to keep it.

In [171]: data = {'1': ['A', 'B', 'C', 'D', 'E'], 
     ...:         '2': [1, 0, 1, 0, 1], 
     ...:         '3': [1, 1, 0, 0, 3],
     ...:         '4': [0, 0, 1, 1, 1]}
     ...: df = pd.DataFrame(data)
     ...: 

In [172]: df.rename(columns = lambda x : 'variable_' + x)
Out[172]: 
  variable_1  variable_2  variable_3  variable_4
0          A           1           1           0
1          B           0           1           0
2          C           1           0           1
3          D           0           0           1
4          E           1           3           1
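
As a possible alternative, pandas also provides DataFrame.add_prefix, which does the same thing without a lambda. The sketch below reuses the two-column frame from the question; the integer-labelled frame (df_int) is a hypothetical case that is not in the question, added only to show that 'variable_' + x would fail on non-string labels unless the label is converted first.

```python
import pandas as pd

# Two-column sample frame from the question (column labels are strings).
df = pd.DataFrame({'1': ['A', 'B', 'C', 'D', 'E'],
                   '2': [1, 0, 1, 0, 1]})

# add_prefix returns a new DataFrame with the prefix on every column label;
# the original df is left unchanged.
prefixed = df.add_prefix('variable_')
print(list(prefixed.columns))

# Hypothetical case: integer column labels. Formatting the label into the
# string avoids the TypeError that 'variable_' + 1 would raise.
df_int = pd.DataFrame({1: ['A', 'B'], 2: [1, 0]})
renamed = df_int.rename(columns=lambda c: f'variable_{c}')
print(list(renamed.columns))
```

Both calls work for any number of columns, so the same line covers the 2-column and 20-column days.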
chrisb