
Assume a DataFrame with MultiIndex columns, as follows:

import pandas as pd
import numpy as np

arrays = [np.array(['John', 'John', 'John', 'Jane', 'Jane', 'Jane']),
          np.array(['New York', 'New York', 'San Francisco', 'New York', 'New York', 'San Francisco']),
          np.array(['2018', '2019', '2018', '2019', '2020', '2020']),
          np.array(['HR', 'Finance', 'HR', 'Finance', 'HR', 'Finance']),
          np.array(['Company A', 'Company A', 'Company A', 'Company B', 'Company B', 'Company B']),
          np.array(['Manager 1', 'Manager 1', 'Manager 2', 'Manager 2', 'Manager 3', 'Manager 3'])]

df = pd.DataFrame(np.random.randn(3, 6), columns=arrays)
df.columns.names = ['userid', 'city', 'year', 'department', 'company', 'line_manager']
display(df)

(The real case is much bigger, of course.)

I need to change the values in the level called userid based on a function.

Here is an example:

def change_name(name):
    if name.startswith("J"):
        # only change the name under certain conditions
        result = name + "00"
    else:
        result = name
    return result

How do I do that? The values in the other index levels should remain the same.

JFerro

1 Answer


Use DataFrame.rename with the level parameter:

# simplified version of the function
def change_name(name):
    if name.startswith("J"): 
        name += "00"
    return name

df = df.rename(columns=change_name, level='userid')
print(df)
userid          John00                            Jane00            \
city          New York           San Francisco  New York             
year              2018      2019          2018      2019      2020   
department          HR   Finance            HR   Finance        HR   
company      Company A Company A     Company A Company B Company B   
line_manager Manager 1 Manager 1     Manager 2 Manager 2 Manager 3   
0            -2.030992 -1.084850     -0.896033  0.965438  1.569484   
1            -0.858253  0.124797     -0.697320 -1.090280 -0.900256   
2            -1.292128  1.560351      0.365616  0.674399  0.056201   

userid                      
city         San Francisco  
year                  2020  
department         Finance  
company          Company B  
line_manager     Manager 3  
0                 1.877191  
1                 0.732859  
2                 0.565685  

Alternatively, use a lambda function:

change_name = lambda x: x + "00" if x.startswith("J") else x
df = df.rename(columns=change_name, level='userid')
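To check that only the userid level is touched, here is a minimal, self-contained sketch. The small two-level frame and the "Alice" and "Jersey City" entries are made up for illustration and are not part of the question's data; "Jersey City" is included because it also starts with "J" but lives in a different level.

```python
import numpy as np
import pandas as pd

# Hypothetical, reduced version of the question's column index (two levels only).
columns = pd.MultiIndex.from_arrays(
    [
        ["John", "John", "Jane", "Alice"],
        ["New York", "San Francisco", "New York", "Jersey City"],
    ],
    names=["userid", "city"],
)
df = pd.DataFrame(np.zeros((2, 4)), columns=columns)


def change_name(name):
    # Append "00" only to names starting with "J"
    if name.startswith("J"):
        name += "00"
    return name


# level="userid" restricts the function to that level of the columns
renamed = df.rename(columns=change_name, level="userid")

userids = renamed.columns.get_level_values("userid").tolist()
cities = renamed.columns.get_level_values("city").tolist()
print(userids)
print(cities)
```

In this sketch, "Jersey City" should come back unchanged even though it starts with "J", because the function is applied to the userid level only.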
jezrael