I am trying to remove duplicated words and numbers from my column names, including duplicates that are not adjacent to each other.
E.g. I currently have df['Weeks with more than 60 hours 60'] and I want to get df['Weeks with more than 60 hours'].
I tried:
df.columns = df.columns.str.split().apply(lambda x:OrderedDict.fromkeys(x).keys()).str.join(' ')
following the question "Python Dataframe: Remove duplicate words in the same cell within a column in Python".
But I get the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-85-1078b4f07191> in <module>()
31 df_t.columns = df_t.columns.str.replace(r"."," ")
32 df_t.columns = df_t.columns.str.strip()
---> 33 df_t.columns = df_t.columns.str.split().apply(lambda x:OrderedDict.fromkeys(x).keys()).str.join(' ')
34
35 # df_t.columns = df_t.columns.str.replace(r"\(.*\)","")
AttributeError: 'Index' object has no attribute 'apply'
Suggestions?
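For reference, here is a minimal sketch of one possible approach. It assumes the error comes from df.columns being a pandas Index, which provides map but not apply (apply belongs to Series and DataFrame). The helper name drop_repeated_words and the sample column names are made up for illustration; they are not from my real data.

```python
from collections import OrderedDict

import pandas as pd


def drop_repeated_words(name):
    """Keep only the first occurrence of each whitespace-separated token."""
    # OrderedDict.fromkeys preserves first-seen order while dropping repeats.
    return " ".join(OrderedDict.fromkeys(name.split()))


# Hypothetical frame with a column name that repeats "60".
df = pd.DataFrame(columns=["Weeks with more than 60 hours 60", "Total hours"])

# Index has no .apply, but Index.map applies a function to every label.
df.columns = df.columns.map(drop_repeated_words)

print(list(df.columns))
```

Note that this drops every later repeat of a token, so a name like "Sales 2019 vs Sales 2020" would lose its second "Sales". If two cleaned names end up identical, the frame will have duplicate column labels.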