I am writing a generic DataFrame cleansing function as follows:
def cleanse_data(df, cols_to_strip):
    df.replace({r'(?=.*)(\s*\[.*\]\s*)': '', r'\*': '', r'\+': '', r',.*': '', '—': ''}, inplace=True, regex=True)
    df.columns.str.strip()
    df[cols_to_strip] = df[cols_to_strip].applymap(lambda x: x.strip())
    return df
The second argument is the list of columns whose values should be passed through strip() (i.e. have their leading and trailing whitespace removed). I call the function like this:
nhl_df = cleanse_data(nhl_df,['team'])
print(nhl_df[nhl_df['team'] == 'Jose Sharks'])  # doesn't work
print(nhl_df[nhl_df['team'].str.strip() == 'Jose Sharks'])  # works
So it seems that, for some reason, the stripping inside the cleansing function didn't work (although the regex replacement worked fine). Is there any reason for this?
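
For comparison, here is a minimal sketch of the behaviour I expect, written as a variant of the function above. The sample frame and its values are made up for illustration and are not my real data. The variant differs from my code in three ways, all of which are my own assumptions about what a cleaner version might look like: it works on a copy instead of using inplace=True, it assigns the result of df.columns.str.strip() back (that call returns a new Index and changes nothing on its own), and it uses the per-column .str.strip() accessor instead of applymap, which newer pandas releases deprecate.

```python
import pandas as pd

def cleanse_data(df, cols_to_strip):
    # Same regex substitutions as in the question, applied to a new frame
    # rather than in place (a choice made for this sketch only).
    df = df.replace(
        {r'(?=.*)(\s*\[.*\]\s*)': '', r'\*': '', r'\+': '', r',.*': '', '—': ''},
        regex=True,
    )
    # str.strip() on an Index returns a new Index, so assign it back.
    df.columns = df.columns.str.strip()
    # Strip leading/trailing whitespace from each requested column.
    for col in cols_to_strip:
        df[col] = df[col].str.strip()
    return df

# Hypothetical sample data, not the real nhl_df.
sample = pd.DataFrame({
    'team': ['  Jose Sharks* ', 'Kings [note 1]'],
    'W': ['45', '40'],
})
cleaned = cleanse_data(sample, ['team'])

# repr() makes any remaining leading/trailing characters visible.
print(cleaned['team'].map(repr).tolist())
print(cleaned[cleaned['team'] == 'Jose Sharks'])
```

On this toy input the comparison with 'Jose Sharks' matches after cleansing, so the sketch does not by itself explain the failure on my real data. Printing nhl_df['team'].map(repr) on the real frame should at least show which characters are still surrounding the names.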