My purpose is to transform date
column from object type in dateframe df
into datetime type, but suffered a lot from view and copy warning when running the program.
I've found some useful information from link: https://stackoverflow.com/a/25254087/3849539
And tested following three solutions, all of them work as expected, but with different warning messages. Could anyone help explain their differences and point out why still warning message for returning a view versus a copy? Thanks.
Solution 1: df['date'] = df['date'].astype('datetime64')
test.py:85: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy df['date'] = df['date'].astype('datetime64')
Solution 2: df['date'] = pd.to_datetime(df['date'])
~/report/lib/python3.8/site-packages/pandas/core/frame.py:3188: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy self[k1] = value[k2] test.py:85: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Solution 3: df.loc[:, 'date'] = pd.to_datetime(df.loc[:, 'date'])
~/report/lib/python3.8/site-packages/pandas/core/indexing.py:1676: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy self._setitem_single_column(ilocs[0], value, pi)