When renaming the columns of a DataFrame, I need to preserve the original column names. For example:
import pandas as pd
import janitor  # pyjanitor; importing it registers clean_names on DataFrame
santandar_data = pd.read_csv(r"train.csv", nrows=40000)
santandar_data.shape
santandar_data.original_names=santandar_data.columns
ndf=santandar_data
ndf.original_names
Index(['ID', 'var3', 'var15', 'imp_ent_var16_ult1', 'imp_op_var39_comer_ult1',
'imp_op_var39_comer_ult3', 'imp_op_var40_comer_ult1',
'imp_op_var40_comer_ult3', 'imp_op_var40_efect_ult1',
'imp_op_var40_efect_ult3',
...
'saldo_medio_var33_hace2', 'saldo_medio_var33_hace3',
'saldo_medio_var33_ult1', 'saldo_medio_var33_ult3',
'saldo_medio_var44_hace2', 'saldo_medio_var44_hace3',
'saldo_medio_var44_ult1', 'saldo_medio_var44_ult3', 'var38', 'TARGET'],
dtype='object', length=371)
The ndf DataFrame has an original_names attribute that works as expected. But after I call the clean_names function, the DataFrame it returns no longer has this attribute.
df=santandar_data.clean_names(case_type="upper", remove_special=True).limit_column_characters(3)
df.original_names
AttributeError: 'DataFrame' object has no attribute 'original_names'
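The same thing seems to happen with plain pandas, without pyjanitor. Here is a minimal reproduction, assuming only that pandas is installed; the toy column names are made up, and rename is used as a stand-in for clean_names:

```python
import warnings

import pandas as pd

# Toy frame with made-up column names (not the real train.csv columns).
toy = pd.DataFrame({"var 1": [1, 2], "var-2": [3, 4]})

# pandas emits a UserWarning here, because attribute assignment does not
# create a column; the value ends up as a plain Python attribute on this
# one object only.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    toy.original_names = toy.columns

alias = toy                              # same object, attribute is visible
renamed = toy.rename(columns=str.upper)  # new object, attribute is not copied

alias_has_names = hasattr(alias, "original_names")
renamed_has_names = hasattr(renamed, "original_names")
print(alias_has_names, renamed_has_names)
```

So ndf only "keeps" the attribute because it is the same object as santandar_data, while any method that returns a new DataFrame drops it.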
The clean_names function comes from:
https://github.com/ericmjl/pyjanitor/blob/master/janitor/functions.py
What is the best way to change this function so that the original column names are kept as an attribute of the returned DataFrame?
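One direction I am considering, as a sketch rather than a change to pyjanitor itself, is to wrap the renaming step and store the old names in DataFrame.attrs, which the pandas documentation marks as experimental. The helper name rename_keeping_originals is hypothetical, and the lambda below is only a stand-in for clean_names:

```python
import pandas as pd


def rename_keeping_originals(df, renamer):
    """Apply `renamer` (DataFrame -> DataFrame) and record the old column names.

    Hypothetical helper: the names are stored under attrs["original_names"].
    """
    original = list(df.columns)   # capture the names before renaming
    result = renamer(df)
    result.attrs["original_names"] = original
    return result


# Toy frame with made-up column names.
toy = pd.DataFrame({"var 1": [1, 2], "var-2": [3, 4]})

# Stand-in for clean_names: upper-case and replace separators with underscores.
cleaned = rename_keeping_originals(
    toy,
    lambda d: d.rename(
        columns=lambda c: c.upper().replace(" ", "_").replace("-", "_")
    ),
)

print(list(cleaned.columns))
print(cleaned.attrs["original_names"])
```

With pyjanitor the renamer would presumably be something like lambda d: d.clean_names(case_type="upper", remove_special=True), but I have not verified that. I am also not sure attrs survives every later pandas operation, so the names might need to be re-attached after further chaining.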