
I am getting this warning:

 C:\Python27\lib\site-packages\pandas\core\indexing.py:411: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s

I get it even though I am already using df.loc, as the documentation suggests. Why?

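import nltk

def sentenceInReview(df):
    # Load the pre-trained Punkt sentence tokenizer for English
    tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
    print("size of df: " + str(df.size))
    # Replace each review with the list of sentences it contains
    df.loc[:, 'review_text'] = df.review_text.map(lambda x: tokenizer.tokenize(x))

    print(df[:3])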
swati saoji
  • @AndyHayden No it gives me the same warning even on using apply instead of map – swati saoji Apr 27 '15 at 06:23
  • If you call the function with a newly created dataframe, does it still give the warning? df may already be a 'copy of a slice from a DataFrame' once it enters the method. – firelynx Apr 27 '15 at 08:36
  • Yes, that's right, the newly created dataframe does not give me the warning. – swati saoji Apr 27 '15 at 14:29
  • Does this answer your question? [Set value for particular cell in pandas DataFrame using index](https://stackoverflow.com/questions/13842088/set-value-for-particular-cell-in-pandas-dataframe-using-index) – Serge Stroobandt Aug 16 '21 at 08:56
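
The exchange in the comments above suggests the warning comes from how df was produced before the call, not from the .loc line itself. A minimal sketch of that situation follows. The frame `reviews`, the column `stars` and the function `shout` are hypothetical stand-ins (with `str.upper` in place of the NLTK tokenizer), and taking an explicit `.copy()` at the call site is one common remedy, not something confirmed in the thread.

```python
import pandas as pd

def shout(df):
    # Same pattern as in the question: overwrite a column through .loc
    df.loc[:, "review_text"] = df["review_text"].map(str.upper)
    return df

# Hypothetical data standing in for the real reviews
reviews = pd.DataFrame({
    "review_text": ["Good. Fast.", "Bad.", "Okay. Slow. Cheap."],
    "stars": [5, 1, 3],
})

# Passing reviews[reviews["stars"] >= 3] directly would hand the function
# a subset that pandas still treats as derived from `reviews`.
# An explicit copy gives the function an independent frame instead.
subset = reviews[reviews["stars"] >= 3].copy()
result = shout(subset)

print(result["review_text"].tolist())
print(reviews["review_text"].tolist())  # the original frame is untouched
```

Copying where the subset is created keeps the function itself unchanged and makes it explicit that the original frame is not meant to be modified.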

3 Answers


I ran into this problem earlier today. It is related to the way Python passes object references around when calling functions, assigning variables, etc.

Unlike in, say, R, assigning an existing dataframe to a new variable in Python doesn't make a copy, so any operation on the 'new' dataframe still acts on a reference to the original underlying data.

The way around this is to make a deep copy (see docs) whenever you want to work on an independent copy of something. See:

import pandas as pd
data = [1, 2, 3, 4, 5]
df = pd.DataFrame(data, columns=['num'])  # columns must be a list, not a set
dfh = df.head(3)  # This assignment doesn't actually make an independent copy
dfh.loc[:, 'num'] = dfh['num'].apply(lambda x: x + 1)
# This line triggers the SettingWithCopyWarning (a warning, not an error)

# Use the deepcopy function from the standard-library module 'copy'
import copy
df_copy = copy.deepcopy(df.head(3))
df_copy.loc[:, 'num'] = df_copy['num'].apply(lambda x: x + 1)
# Making a deep copy breaks the reference to the original df. Hence, no more warnings.

Here's a bit more on this topic that might explain the way Python does it better.

Hansang
  • `pd.DataFrame` has its own `copy()` method, so there's no need to import `deepcopy` for this. – zslim May 24 '21 at 20:51
  • @zslim Strangely, I also got the error even after using the df.copy() method. I have not tried deepcopy, but it might still make the difference here. Normally, the df.copy() should be enough, though. – questionto42 Jun 14 '21 at 08:15
  • I got this message even though I used deepcopy. I will try the pd.DataFrame copy() method to see what happens. Assigning a dataframe (or any other variable) in R does not in general make a copy. This is explained in detail in the Advanced R book. However, if the dataframe is modified later then copies may be made. If columns are modified then those columns are copied. If a row is modified then the whole dataframe is modified. For this reason iterating over rows is slow. – Soldalma Sep 06 '21 at 14:27
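
As the first comment points out, `DataFrame.copy()` can replace `copy.deepcopy` here. A minimal sketch of the same example using it, relying on the fact that `copy()` defaults to `deep=True`. Later comments report mixed results, so treat this as the usual approach, not a guaranteed fix for every case.

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3, 4, 5], columns=["num"])

# copy() duplicates the data (deep=True is the default),
# so df_copy no longer depends on df
df_copy = df.head(3).copy()
df_copy.loc[:, "num"] = df_copy["num"].apply(lambda x: x + 1)

print(df_copy["num"].tolist())  # [2, 3, 4]
print(df["num"].tolist())       # [1, 2, 3, 4, 5]
```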

A common reason for the warning message "A value is trying to be set on a copy of a slice from a DataFrame" is a slice taken from another slice. For example:

dfA = dfB[['x', 'y', 'z']]  # selecting several columns needs a list inside the brackets
dfC = dfA[['x', 'z']]

With the code above you may get the message because dfC is a slice of dfA, while dfA is itself a slice of dfB. In other words, dfC is a slice over another slice, and both are linked to dfB. In that situation, using .copy(), deepcopy or similar approaches may not help :-(

Solution: take both slices directly from dfB.

dfA = dfB[['x', 'y', 'z']]
dfC = dfB[['x', 'z']]

Hopefully the above explanation helps :-)
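
A small runnable sketch of the solution above, assuming a hypothetical dfB with columns w, x, y and z. The `.copy()` call and the assignment into dfC are additions for illustration and are not part of the answer.

```python
import pandas as pd

# Hypothetical source frame
dfB = pd.DataFrame({"w": [0, 0], "x": [1, 2], "y": [3, 4], "z": [5, 6]})

# Both selections are taken from dfB directly, not from each other
dfA = dfB[["x", "y", "z"]]
dfC = dfB[["x", "z"]].copy()  # assumption: copy because dfC is modified below

dfC.loc[:, "x"] = dfC["x"] * 10

print(dfC["x"].tolist())
print(dfB["x"].tolist())  # dfB keeps its original values
```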

Peter D WANG
  • Thanks for the explanation. It helped me understand a similar problem... The Pandas warnings weren't so helpful... – Alon Samuel Dec 17 '20 at 07:53

Try inserting the values as a Series with an explicit index: pd.Series(data, index=index_list)
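
A minimal sketch of what that could look like. The frame and the values in `data` and `index_list` are made up for illustration. The point is that pandas aligns the Series to the frame by index label when it is assigned as a column, so the order of `data` does not have to match the row order.

```python
import pandas as pd

df = pd.DataFrame({"num": [1, 2, 3]}, index=["a", "b", "c"])

# Hypothetical new values, deliberately listed in a different order
data = [30, 20, 10]
index_list = ["c", "b", "a"]

# Each value lands on the row whose label matches its index entry
df["num"] = pd.Series(data, index=index_list)

print(df["num"].tolist())  # [10, 20, 30]
```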