I have a df with these columns:

Index(['Instrument', 'Date', 'Return on Invst Cap', 'Date',
       'Book Value Per Share, Total Equity', 'Date',
       'Earnings Per Share Reported - Actual', 'Date',
       'Revenue from Business Activities - Total', 'Date',
       'Free Cash Flow - Actual', 'Date', 'Total Long Term Debt', 'Date',
       'Profit/(Loss) - Starting Line - Cash Flow'],
      dtype='object')

There are several columns called 'Date'; some of them hold the same values, some don't.

I would like to keep only the first "Date" column and drop the rest. I think one important step is to rename the first "Date" column to something else, for example "1 Date", and then drop the other "Date" columns.

But I failed to rename just this column. For example, I tried df_big5_simplified = df_big5.rename(columns={1: '1 Date'}) to rename the column by its index position,

but the resulting df is exactly the same...

I also tried this approach:

columns=pd.Index(['Date', 'Instrument', 'Return on Invst Cap',
       'Book Value Per Share, Total Equity',
       'Earnings Per Share Reported - Actual',
       'Revenue from Business Activities - Total', 'Free Cash Flow - Actual',
       'Total Long Term Debt', 'Profit/(Loss) - Starting Line - Cash Flow'], name='item')

df_big5_simplified = df_big5.reindex(columns=columns)

Then I got this error:

ValueError: cannot reindex from a duplicate axis

Any ideas? I could have 50 columns with the same name and only want to keep the first one.

neutralname
  • Does this answer your question? [python pandas remove duplicate columns](https://stackoverflow.com/questions/14984119/python-pandas-remove-duplicate-columns) – Zaraki Kenpachi Sep 11 '20 at 08:47
  • @ZarakiKenpachi Thanks, I also wish to understand why I can't rename the 2nd column by index... – neutralname Sep 11 '20 at 08:54
  • Rename by location: df.rename(columns={ df.columns[2]: "keep_date" }, inplace = True) – gtomer Sep 11 '20 at 08:56
  • @gtomer Thanks, I tried your code but it has replaced every column called "Date", it's weird, do you know why my line df_big5_simplified= df_big5.rename(columns={1: '1 Date'}) doesn't work? – neutralname Sep 11 '20 at 09:03
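
On the question raised in these comments: as far as I know, rename() matches column labels rather than positions, and by default it silently ignores keys that match no label. That would explain both observations. The sketch below uses a small made-up frame (the values and the names unchanged, all_renamed and by_position are purely illustrative) to show the two behaviours and one possible way to rename a single column by position.

```python
import pandas as pd

# Toy frame with a repeated 'Date' label (hypothetical values, for illustration only)
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=["Instrument", "Date", "Return on Invst Cap", "Date"],
)

# rename() maps labels, not positions: no column is labelled 1, so nothing changes
unchanged = df.rename(columns={1: "1 Date"})

# df.columns[1] is just the label 'Date', so every 'Date' column gets renamed
all_renamed = df.rename(columns={df.columns[1]: "keep_date"})

# To rename by position, edit a copy of the label list and assign it back
new_cols = df.columns.tolist()
new_cols[1] = "1 Date"
by_position = df.copy()
by_position.columns = new_cols

print(list(unchanged.columns))
print(list(all_renamed.columns))
print(list(by_position.columns))
```

Assigning a full list of labels avoids label lookup entirely, which is why it can target one of several identically named columns.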

1 Answer

You can set all the column names at once, giving each "Date" column a unique name:

df = df.set_axis(['Instrument', 'Date', 'Return on Invst Cap', 'Date2',
       'Book Value Per Share, Total Equity', 'Date3',
       'Earnings Per Share Reported - Actual', 'Date4',
       'Revenue from Business Activities - Total', 'Date5',
       'Free Cash Flow - Actual', 'Date6', 'Total Long Term Debt', 'Date7',
       'Profit/(Loss) - Starting Line - Cash Flow'], axis=1)
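
If there are too many repeated names to type out by hand (the question mentions 50), a boolean mask over the column labels may be simpler. This is a minimal sketch on a made-up frame, not the asker's data; the names keep and df_simplified are illustrative. It assumes that "first" means the left-most column carrying each label.

```python
import pandas as pd

# Toy frame with repeated 'Date' labels (hypothetical values, for illustration only)
df = pd.DataFrame(
    [["AAA", "2020-01-01", 0.10, "2020-01-02", 5.0],
     ["BBB", "2020-02-01", 0.20, "2020-02-03", 7.5]],
    columns=["Instrument", "Date", "Return on Invst Cap", "Date",
             "Book Value Per Share, Total Equity"],
)

# columns.duplicated() flags every repeat of a label after its first occurrence;
# inverting the mask keeps only the first column for each label
keep = ~df.columns.duplicated(keep="first")
df_simplified = df.loc[:, keep]

print(list(df_simplified.columns))
```

Because the selection is by boolean mask rather than by label, it does not trip over the duplicate names the way reindex() does.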
gtomer