My goal is to convert the date column of DataFrame df from object dtype to datetime, but I keep running into the view-versus-copy warning when I run the program.

I found some useful information here: https://stackoverflow.com/a/25254087/3849539

I tested the following three solutions. All of them work as expected, but each produces different warning messages. Could anyone explain how they differ and why I still get the warning about returning a view versus a copy? Thanks.

Solution 1: `df['date'] = df['date'].astype('datetime64')`

test.py:85: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['date'] = df['date'].astype('datetime64')

Solution 2: `df['date'] = pd.to_datetime(df['date'])`

~/report/lib/python3.8/site-packages/pandas/core/frame.py:3188: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self[k1] = value[k2]

test.py:85: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

Solution 3: `df.loc[:, 'date'] = pd.to_datetime(df.loc[:, 'date'])`

~/report/lib/python3.8/site-packages/pandas/core/indexing.py:1676: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_column(ilocs[0], value, pi)

x86_64
  • Does this answer your question? [How to deal with SettingWithCopyWarning in Pandas](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – d8aninja Aug 02 '21 at 12:48

2 Answers

3

Changing how you do the datetime conversion will not fix the SettingWithCopyWarning. You get it because the df you are working with is already a slice of some larger DataFrame. Pandas is warning you that your assignment applies to that slice and not to the original data. For example, if you create a new column in df, you will still get the warning, and the column will exist in your slice but not in the original DataFrame.
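A minimal sketch of the point above, assuming a hypothetical parent frame named `original` with illustrative columns `date` and `value` (none of these names come from the question). On pandas versions that still emit SettingWithCopyWarning, the assignment below may trigger it; either way, the new column appears only on the subset.

```python
import pandas as pd

# Hypothetical parent frame; the names and values are illustrative only.
original = pd.DataFrame(
    {"date": ["2021-01-01", "2021-02-15", "2021-03-30"], "value": [1, 2, 3]}
)

# Boolean filtering produces a new object derived from `original`.
subset = original[original["value"] > 1]

# May emit SettingWithCopyWarning on pandas versions that have it.
subset["doubled"] = subset["value"] * 2

print("doubled" in subset.columns)    # the column exists on the subset
print("doubled" in original.columns)  # the parent frame is unchanged
```

The warning here is about ambiguity, not failure: the assignment succeeds, but pandas cannot tell whether you meant to modify the parent frame as well.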

If you know what you are doing, you can turn these warnings off with `pd.options.mode.chained_assignment = None` (the default is `'warn'`).

Darina
  • Hi Darina, thanks for your answer. Based on your information, if I want to create a new column on the original data rather than on the slice, how should I do it? Thanks. – x86_64 Feb 24 '21 at 01:06
  • Then instead of using `df`, use the original dataframe that it came from. By the way, it could be that at some point you do something like `df = df[df.column==condition]`, which also creates a slice (and overwrites the original `df` value). If my answer helps you, could you please accept it as correct? – Darina Feb 24 '21 at 10:00
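One way to read the suggestion in the comment above is to do the conversion and add any new columns on the parent frame first, and only filter afterwards. The following is a sketch under that assumption; `original`, `value` and `year` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Hypothetical parent frame; the names and values are illustrative only.
original = pd.DataFrame(
    {"date": ["2021-01-01", "2021-02-15", "2021-03-30"], "value": [1, 2, 3]}
)

# Convert and add columns on the parent frame, before any filtering.
original["date"] = pd.to_datetime(original["date"])
original["year"] = original["date"].dt.year

# Filter afterwards; the filtered frame inherits the converted columns.
df = original[original["value"] > 1]

print(df[["date", "year"]])
```

Because the assignments target the unfiltered frame, there is no slice involved at the point of assignment, so the new column exists both in `original` and in the filtered `df`.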
1

I got similar warnings recently. After several tries, I found that, at least in my case, the problem was not related to the three solutions. It might be your `df`.

If your df was created as a slice of another pandas DataFrame, such as:

df = dfOrigin.loc[some rows, :] or
df = dfOrigin[[some columns]] or
df = dfOrigin[one column]

Then assigning to df will trigger that warning. Try taking an explicit copy instead, e.g. df = dfOrigin[[some columns]].copy().

Code to reproduce this:

import numpy as np
import pandas as pd
np.random.seed(2021)
dfOrigin = pd.DataFrame(np.random.choice(10, (4, 3)), columns=list('ABC'))
print("Original dfOrigin")
print(dfOrigin)
#    A  B  C
# 0  4  5  9
# 1  0  6  5
# 2  8  6  6
# 3  6  6  1
df = dfOrigin[['B', 'C']]  # Column subset derived from dfOrigin (pandas flags it as a possible copy)
df.loc[:, 'B'] = df['B'].astype(str)  # Triggers SettingWithCopyWarning

df2 = dfOrigin[['B', 'C']].copy()  # Explicit, independent copy
df2['B'] = df2['B'].astype(str)  # No warning
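Applied to the date conversion from the question, the same `.copy()` approach might look like the sketch below. The frame `df_origin` and its columns are made up for illustration; the original post does not show how `df` was created.

```python
import pandas as pd

# Hypothetical source frame standing in for whatever `df` was sliced from.
df_origin = pd.DataFrame(
    {"date": ["2021-02-24", "2021-08-02"], "A": [1, 2], "B": [3, 4]}
)

# Take an explicit copy of the column subset before assigning to it.
df = df_origin[["date", "A"]].copy()

# Any of the three conversions from the question can then be used;
# pd.to_datetime is shown here.
df["date"] = pd.to_datetime(df["date"])

print(df.dtypes)
print(df_origin.dtypes)  # the source frame keeps its original string column
```

The copy costs some extra memory, but it makes the intent explicit: `df` is an independent frame, and changes to it are not expected to reach `df_origin`.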
Adrian Mole
Raymond
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Jan 05 '22 at 13:56