I have a DataFrame called mydf, shown below, from which I derived another DataFrame, tail5, holding the last rows of mydf. I was experimenting to find out whether tail5 contains a copy or a view of the underlying mydf data. But when I update the value of a cell using .loc, pandas raises a SettingWithCopyWarning. Any idea why? The pandas manual page on tail does not say anything about the nature of the data returned by the tail method: whether it returns a view (so that modifying it also modifies the original data source) or a copy (which should not be modified for some reason). Here's the code:

>>> import numpy, pandas
>>> mydf = pandas.DataFrame(numpy.random.rand(10,3), columns=('a','b','c'))

>>> mydf
          a         b         c
0  0.263551  0.175394  0.570277
1  0.032766  0.243175  0.524796
2  0.034853  0.607542  0.568370
3  0.021440  0.685070  0.121700
4  0.253535  0.402529  0.264492
5  0.381109  0.964744  0.362228
6  0.860779  0.670297  0.035365
7  0.872243  0.960212  0.306681
8  0.698318  0.530086  0.469734
9  0.910518  0.697919  0.238539

>>> tail5 = mydf.tail(5)

>>> tail5
          a         b         c
5  0.381109  0.964744  0.362228
6  0.860779  0.670297  0.035365
7  0.872243  0.960212  0.306681
8  0.698318  0.530086  0.469734
9  0.910518  0.697919  0.238539

>>> tail5.loc[9,'a']= 0.8321
/.../lib/python3.7/site-packages/pandas/core/indexing.py:189: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._setitem_with_indexer(indexer, value)
/.../bin/ipython:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

In my case above, the data in the underlying mydf was modified as a side effect of the .loc assignment on tail5 (see row 9, column a):

>>> mydf
          a         b         c
0  0.263551  0.175394  0.570277
1  0.032766  0.243175  0.524796
2  0.034853  0.607542  0.568370
3  0.021440  0.685070  0.121700
4  0.253535  0.402529  0.264492
5  0.381109  0.964744  0.362228
6  0.860779  0.670297  0.035365
7  0.872243  0.960212  0.306681
8  0.698318  0.530086  0.469734
9  0.832100  0.697919  0.238539
Wirawan Purwanto

1 Answer

If you use the help function you can see that tail comes from the pandas.core.generic module:

help(pandas.DataFrame.tail)

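In that module, the tail method is implemented as an iloc slice of the last n rows (5 by default), so what it returns is a subset of your DataFrame rather than a new, independent object. In your case that subset is a view, which is why mydf changed as well. pandas cannot tell whether you meant to modify the original through that subset, and that is why you get the warning when you assign into tail5.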
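As a quick check of that claim, here is a minimal sketch comparing tail with an explicit positional slice. The `pd` alias and the names `sliced` and `same_rows` are only for illustration and are not part of the question's code.

```python
import numpy as np
import pandas as pd

mydf = pd.DataFrame(np.random.rand(10, 3), columns=("a", "b", "c"))

# tail(5) selects the same rows as a positional slice of the last 5 rows.
tail5 = mydf.tail(5)
sliced = mydf.iloc[-5:]

# equals() compares values, dtypes and index labels.
same_rows = tail5.equals(sliced)
print(same_rows)
```

Both objects carry the original index labels 5 through 9, which is why `tail5.loc[9, 'a']` addresses the last row of mydf.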

If you take an explicit copy instead:

    import numpy as np
    import pandas

    mydf = pandas.DataFrame(np.random.rand(10,3), columns=('a','b','c'))
    tail5 = mydf.tail(5).copy()  # independent copy, no longer tied to mydf
    tail5.loc[9,'a'] = 0.8321

The warning disappears, and the assignment no longer touches mydf.
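To confirm that the copy really is independent of mydf, a small sketch along these lines should do. The `pd` alias and the `original` variable are illustrative additions, not part of the question.

```python
import numpy as np
import pandas as pd

mydf = pd.DataFrame(np.random.rand(10, 3), columns=("a", "b", "c"))
original = mydf.loc[9, "a"]  # remember the value before touching the copy

tail5 = mydf.tail(5).copy()  # explicit, independent copy
tail5.loc[9, "a"] = 0.8321

# The copy holds the new value; the source DataFrame keeps its old one.
print(tail5.loc[9, "a"])
print(mydf.loc[9, "a"] == original)
```

If you actually want the change to land in mydf, assign on mydf directly (for example `mydf.loc[9, 'a'] = 0.8321`) rather than going through the result of tail.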

Jordan Delbar