I have a bunch of CSV files that I read as dataframes. For each dataframe, I want to rename certain columns, but only if those columns exist in that dataframe:

column_name_update_map = {'aa': 'xx', 'bb': 'yy'}

With this map, if a column named 'aa' or 'bb' exists in a dataframe, I want to rename 'aa' to 'xx' and 'bb' to 'yy'. No cell values should be changed.

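import numpy as np
import pandas as pd

# files: the list of tab-separated file paths to process
for file in files:
    print('Current file: ', file)
    df = pd.read_csv(file, sep='\t')
    # Replace missing values with empty strings
    df = df.replace(np.nan, '', regex=True)
    for index, row in df.iterrows():
        pass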

I don't think I should need the inner loop, but if I do, what's the right way to change only the column names?

2 Answers

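You can use the dataframe's rename method. By default, keys in the mapping that do not match an existing column are ignored, so no existence check or inner loop is needed: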

column_name_update_map = {'aa': 'xx', 'bb': 'yy'}
df = df.rename(columns=column_name_update_map) 
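
As a minimal sketch of how this might fit into the loop from the question, the example below applies the same map to several inputs with different columns. The sample_files dictionary and its file names are hypothetical in-memory stand-ins for the real tab-separated files, used only so the snippet runs on its own.

```python
import io

import pandas as pd

column_name_update_map = {'aa': 'xx', 'bb': 'yy'}

# Hypothetical in-memory stand-ins for the tab-separated files in the question.
sample_files = {
    'first.tsv': 'aa\tbb\tc\n1\t2\t3\n',
    'second.tsv': 'aa\tc\n4\t5\n',
    'third.tsv': 'c\td\n6\t7\n',
}

renamed = {}
for name, text in sample_files.items():
    df = pd.read_csv(io.StringIO(text), sep='\t')
    # Keys that are not column names are skipped (errors='ignore' is the default),
    # and only the labels change, never the cell values.
    renamed[name] = df.rename(columns=column_name_update_map)
    print(name, list(renamed[name].columns))
```

Only the columns that are present get renamed; a dataframe with neither 'aa' nor 'bb' comes back with its columns unchanged.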
Rajith Thennakoon

To rename specific columns, pass rename a mapping from old names to new names:

import pandas as pd
import numpy as np

# Create a sample dataframe
df = pd.DataFrame({'aa': [1, 2], 'bb': [3, 4], 'c': [5, 6], '': [7, 8]})

# Rename 'aa' to 'xx', 'bb' to 'yy', and the empty-string column name to NaN
df.rename(columns={'aa': 'xx', 'bb': 'yy', '': np.nan}, inplace=True)

# Display the resulting dataframe
print(df)
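
If you would rather make the "only if the column exists" check explicit, one possible variant (a sketch, not something rename requires) is to filter the map down to the columns that are present first. The names present and result are illustrative. Passing errors='raise' is optional and makes rename raise a KeyError for any key that is not a column, which can help catch typos.

```python
import pandas as pd

df = pd.DataFrame({'aa': [1, 2], 'c': [5, 6]})
column_name_update_map = {'aa': 'xx', 'bb': 'yy'}

# Keep only the entries whose old name is actually a column of df.
present = {old: new for old, new in column_name_update_map.items()
           if old in df.columns}

# Every key in `present` exists, so errors='raise' does not trigger here.
result = df.rename(columns=present, errors='raise')

# With the unfiltered map, 'bb' is missing, so errors='raise' raises KeyError.
try:
    df.rename(columns=column_name_update_map, errors='raise')
    missing_raised = False
except KeyError:
    missing_raised = True
```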

I hope this helps.

Littin Rajan