I'm working with some data where I'm trying to convert entire columns to a different dtype (i.e. from object to datetime, or from object to numeric) using conversion methods rather than setting individual values. Each assignment below raises a 'SettingWithCopyWarning':
#converting euro values column 'value' to numeric values:
df['value'] = pd.to_numeric(df.value, errors='coerce')
#converting object to datetime in order to extract year:
df['date'] = pd.to_datetime(df['date'])
df['date'] = df['date'].dt.year
If I leave any of the above lines in, the warning is raised. If I take all of them out, the code runs without any warnings.
After some research, I learned that the 'SettingWithCopyWarning' crops up when chained assignment is used and pandas cannot tell whether the object being assigned to is a view of the original dataframe or a copy of it (ref: https://www.dataquest.io/blog/settingwithcopywarning/).
I also learned that the general form to avoid chained assignment is df.loc[<mask or index label values>, <optional column>] = <new scalar value or array-like>
(ref: python pandas: how to avoid chained assignment).
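To make that form concrete, here is a minimal sketch of how I understand it. The frame `sample` and its values are made up for illustration and are not my real data:

```python
import pandas as pd

# Hypothetical sample frame, not the real data from the question.
sample = pd.DataFrame({
    "date": ["2013-05-01", "2015-07-01", "2018-01-15"],
    "value": [1.0, 2.0, 3.0],
})

# Boolean mask selecting the rows to change (years 2014-2019).
mask = sample["date"].str.contains("201[4-9]")

# Single .loc call: rows chosen by the mask, column chosen by label.
sample.loc[mask, "value"] = 0.0

# Whole-column form: ':' selects every row.
sample.loc[:, "value"] = sample["value"] * 2
```

The first argument to .loc is a row selector (a mask or index labels) and the second is the column label, so both the row and column choice happen in one indexing call.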
I tried to wrangle something together like this just to test out the form:
df.loc[df['value']] = pd.to_numeric(df.value, errors='coerce')
but it returns an error like:
KeyError: "['$3.40m' '$3.90m' '$12.60m' '$13.80m' '$123.80m' '$171.20m'\n '$205.2m' '$214.40m' '$221.03m'] not in index"
which makes me think that .loc is treating the values I passed as lookup keys, the way a dictionary would, and raising a KeyError when they are not found in the index.
After looking around, I'm still not sure how to apply this form to entire columns (as in my code) when the new values come from methods such as pd.to_numeric or the .dt accessor, without using chained assignment.
Is there a way around this?
Edit:
Lines above the given code:
parent_df = pd.DataFrame.from_records(data, columns=['date', 'value'])
df = parent_df[parent_df.date.str.contains('.*201[4-9]')]
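For anyone who wants to reproduce this, below is a self-contained sketch of the setup. The records in `data` are made up (my real data is not shown here), and the explicit `.copy()` on the filtered frame is an assumption on my part: it is one commonly suggested way to make `df` independent of `parent_df`, but I have not confirmed that it is the right fix for my case.

```python
import pandas as pd

# Made-up records standing in for the real `data` (assumption).
data = [
    ("2013-03-01", "$3.40m"),
    ("2015-06-01", "$12.60m"),
    ("2018-11-01", "$123.80m"),
]
parent_df = pd.DataFrame.from_records(data, columns=["date", "value"])

# Same filter as above, followed by an explicit copy so that `df`
# is its own DataFrame rather than a slice of `parent_df` (assumption).
df = parent_df[parent_df.date.str.contains(".*201[4-9]")].copy()

# Whole-column conversions, same as in the question.
# Strings like '$12.60m' are not parseable as numbers, so
# errors='coerce' turns them into NaN.
df["value"] = pd.to_numeric(df["value"], errors="coerce")
df["date"] = pd.to_datetime(df["date"]).dt.year
```

With the copy in place, the assignments only affect `df`, and `parent_df` keeps its original string columns.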