
I am processing a large spreadsheet file (.xlsx) with Python pandas. Some of the headers are duplicated, and I want to rename specific columns without affecting the other columns that share the same name.

Jack | SPORT | UNI | SHOP | TOTAL | nan | Li   | SPORT | UNI | SHOP | nan |
JULY | 1000  | 200 | 300  | 1500  | NaN | JULY | NaN   | NaN | 1000 | nan |

The table above is the data I extracted from an Excel file. I want the output to look like this:

Month | Amount | UNI | SHOP | TOTAL | Li  | Month | SPORT | TOWN | SHOP |
JULY  | 1000   | 200 | 300  | 1500  | NaN | JULY  | NaN   | NaN  | 1000 |

Questions: 1) Is there a way to select a specific column by position, similar to iloc but for columns? The goal is to rename a specific column without affecting its duplicates.

2) How can I drop the last NaN column?

Reuben Khong

1 Answer


You can set the column names by assigning a list to the .columns attribute (it is an attribute, not a method). The list must contain one name per column. For example:

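import pandas as pd

# Every list must have the same length, so 'c' needs a fourth value
data = {'a': [1, 2, 3, 4], 'b': [3, 2, 2, 1], 'c': [None, 'test', 'hi', None]}
df = pd.DataFrame(data)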

   a  b     c
0  1  3  None
1  2  2  test
2  3  2    hi
3  4  1  None

df.columns = ['C1', 'C2', 'C3']

   C1  C2    C3
0   1   3  None
1   2   2  test
2   3   2    hi
3   4   1  None
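
When headers are duplicated, as in the question, one possible approach is to copy the labels into a plain Python list, overwrite only the positions you care about, and assign the list back. The following is a minimal sketch, not a definitive solution: the frame is a hypothetical reconstruction of the question's table, and the `new_names` mapping is an illustrative name chosen here.

```python
import pandas as pd

# Hypothetical frame mirroring the duplicated headers from the question.
df = pd.DataFrame(
    [["JULY", 1000, 200, 300, 1500, None, "JULY", None, None, 1000, None]],
    columns=["Jack", "SPORT", "UNI", "SHOP", "TOTAL", "nan",
             "Li", "SPORT", "UNI", "SHOP", "nan"],
)

# Illustrative mapping of zero-based column position -> new name.
new_names = {0: "Month", 1: "Amount", 5: "Li", 6: "Month", 8: "TOWN"}

# Copy the labels to a list so single positions can be reassigned.
cols = df.columns.tolist()
for position, name in new_names.items():
    cols[position] = name

# Assign the whole list back; columns not listed in the mapping keep their names.
df.columns = cols
print(df.columns.tolist())
```

Because the change is keyed on position rather than on label, the second SPORT column (position 7) and the trailing nan column (position 10) are left untouched.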

If you want to drop columns, you can use drop().

res = df.drop(columns=['C3'])
   C1  C2
0   1   3
1   2   2
2   3   2
3   4   1
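
Note that drop() works by label, so with duplicated headers it removes every column carrying that label. To remove only the last column, a positional slice with iloc is one option. This is a small sketch on a hypothetical frame with two columns both labelled "nan":

```python
import pandas as pd

# Hypothetical frame with two columns that share the label "nan".
df = pd.DataFrame([[1, 2, None, None]], columns=["a", "b", "nan", "nan"])

# Label-based drop removes both "nan" columns.
by_label = df.drop(columns=["nan"])

# Positional slice keeps every column except the last one.
trimmed = df.iloc[:, :-1]

print(by_label.columns.tolist())
print(trimmed.columns.tolist())
```

The slice returns a new DataFrame, so assign it back (df = df.iloc[:, :-1]) if the original frame should be replaced.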
N. Arunoprayoch
  • The thing is, there are two `nan` and two SPORT headers. I want to rename the 6th column header `nan` to "Li" without changing the last column, and likewise use `df.drop` on the last column without affecting the 6th. The same goes for the "SPORT" header: I want to rename the two "SPORT" headers differently. – Reuben Khong Aug 26 '19 at 05:51