
I have this dataframe:

x = pd.read_csv(r'C:\Users\user\Desktop\Dataset.csv', sep = ',')
x['dates'] = pd.to_datetime(x['dates']) #turn column to datetime type
v = x[(x['proj'].str.contains('3'))] ### This part is causing the issue.

v['mnth_yr'] = v['dates'].apply(lambda x: x.strftime('%B-%Y'))     

and it gives this warning:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

I know there is a post about it but I can't understand how to solve this specific case. Can you help?

Based on the answer:

x = pd.read_csv(r'C:\Users\user\Desktop\Dataset.csv', sep = ',')
x.loc[:,'dates'] = pd.to_datetime(x['dates']) #turn column to datetime type
v = x[(x['proj'].str.contains('3'))] ###This part is causing the issue.
                            ###And in the next line gives the warning, since it's a copy.
v.loc[:,'mnth_yr'] = v['dates'].apply(lambda x: x.strftime('%B-%Y'))  

It still gives the warning. Is there a way to assign to v without the warning?

  • It's a warning, not an error. Also check this: https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas – iamklaus Sep 17 '18 at 13:07
  • As long as you do not care about writing data back to the original DataFrame and know what you are doing, you can just ignore the warning. You can always write back to the original dataframe and then filter. – It_is_Chris Sep 17 '18 at 13:11

1 Answer


The usual way to avoid this warning is to assign through .loc, specifying the column and all rows. For example,

x.loc[:, 'dates'] = pd.to_datetime(x['dates'])
v = x.loc[(x['proj'].str.contains('3')), :] 
...
v.loc[:, 'mnth_yr'] = v['dates'].apply(lambda x: x.strftime('%B-%Y'))

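The difference between the two is that chained indexing, such as x[x['proj'].str.contains('3')]['dates'] = value, first builds an intermediate object that may be a copy, so the assignment may never reach x. A single .loc call, such as x.loc[rows, 'dates'] = value, indexes x directly. This is generally not a problem unless you are doing nested slicing of the data; in that case, nested slicing without .loc can fail to update the original data frame. Note that a boolean filter, with or without .loc, still produces a new object that pandas tracks as derived from x, so assigning into v can still trigger the warning. If v is meant to be independent of x, create it with .copy(). See more details here: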

https://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy
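
For the case in the question, where v is a filtered subset that does not need to write back to x, a minimal sketch of the .copy() approach could look like the following. The small in-memory DataFrame is a hypothetical stand-in for Dataset.csv, whose contents are not shown in the question, and .dt.strftime is swapped in for the original apply/lambda.

```python
import pandas as pd

# Hypothetical stand-in for Dataset.csv (assumed columns: 'proj' and 'dates').
x = pd.DataFrame({
    "proj": ["A3", "B1", "C3"],
    "dates": ["2018-01-15", "2018-02-20", "2018-03-05"],
})
x["dates"] = pd.to_datetime(x["dates"])  # convert the column to datetime type

# .copy() makes v an independent DataFrame instead of a tracked subset of x.
v = x.loc[x["proj"].str.contains("3"), :].copy()

# Adding a column to the copy changes only v; x is left untouched.
v["mnth_yr"] = v["dates"].dt.strftime("%B-%Y")
```

If the new column is also wanted on x, the alternative mentioned in the comments is to add it to x first and filter afterwards.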

amanbirs