I wonder how I could identify whether I have a view or a copy of another dataframe. Given a pandas.DataFrame

import pandas as pd

df = pd.DataFrame( {'a': [0,8,15], 'b': [42,11,0] } )

as well as a view

df1 = df.loc[ 1:2 ]

and a copy

df2 = df.loc[ 1:2 ].copy()

which results in

>>> df
    a   b
0   0  42
1   8  11
2  15   0
>>> df1
    a   b
1   8  11
2  15   0
>>> df2
    a   b
1   8  11
2  15   0

Assigning a value to an existing column raises a SettingWithCopyWarning for df1

>>> df1[ 'a' ] = 42
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  #!/usr/bin/python

but not for df2. However, the assignment of the value worked in both cases. How can I figure out whether I have a copy or just a view of another DataFrame?
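One way to approach the question is to compare the underlying NumPy buffers instead of the DataFrame objects themselves. The following is a minimal sketch, not a pandas feature: `shares_memory_with` is a hypothetical helper name, and it assumes plain NumPy-backed columns (as in the example above), for which `Series.to_numpy()` does not copy. Whether a `.loc` slice shares memory with its parent depends on the pandas version, the dtypes, and the Copy-on-Write setting, so the result for df1 should be checked rather than assumed.

```python
import numpy as np
import pandas as pd


def shares_memory_with(a: pd.DataFrame, b: pd.DataFrame) -> bool:
    """Return True if any same-named column of `a` and `b` overlaps in memory.

    Hypothetical helper: only meaningful for NumPy-backed columns, where
    to_numpy() returns a view rather than a fresh array.
    """
    common = a.columns.intersection(b.columns)
    return any(
        np.shares_memory(a[col].to_numpy(), b[col].to_numpy())
        for col in common
    )


df = pd.DataFrame({"a": [0, 8, 15], "b": [42, 11, 0]})
df1 = df.loc[1:2]          # may or may not be a view, depending on pandas
df2 = df.loc[1:2].copy()   # explicit copy, owns its own data

print("df1 shares memory with df:", shares_memory_with(df, df1))
print("df2 shares memory with df:", shares_memory_with(df, df2))  # False
```

A result of False means the two frames are independent. A result of True means the data currently overlaps, which under Copy-on-Write does not necessarily imply that a write to one frame will show up in the other.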

Note that I cannot see any difference between the DataFrames df1 and df2 by type

>>> type(df1), type(df2)
(<class 'pandas.core.frame.DataFrame'>, <class 'pandas.core.frame.DataFrame'>)

by comparing elementwise equality

>>> df1 == df2
      a     b
1  True  True
2  True  True

by comparing NDFrame objects

>>> df1.equals(df2)
True

or even by comparing column ordering

>>> from pandas.testing import assert_frame_equal
>>> assert_frame_equal(df1, df2)

Note that modifying the view df1 also alters the original DataFrame.

The possible duplicate, Pandas: Subindexing dataframes: Copies vs views, does not answer how I could check this on df1 and df2.

Brad Solomon
desiato
  • possible duplicate of [Pandas: Subindexing dataframes: Copies vs views](http://stackoverflow.com/questions/17960511/pandas-subindexing-dataframes-copies-vs-views) – Alexander Sep 15 '15 at 08:47

1 Answer

You can set the is_copy attribute of the DataFrame in advance, with df.is_copy = True or df.is_copy = False. The latter should avoid the warning.
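As an alternative to setting a flag, the warning text itself points at two patterns that make the intent explicit. The sketch below reuses the question's example data; `df_modified` is a name introduced here for illustration only, so that both cases can be shown side by side.

```python
import pandas as pd

df = pd.DataFrame({"a": [0, 8, 15], "b": [42, 11, 0]})

# Case 1: an explicit copy. Changes stay local to df2 and df is untouched.
df2 = df.loc[1:2].copy()
df2["a"] = 42
print(df["a"].tolist())   # [0, 8, 15]
print(df2["a"].tolist())  # [42, 42]

# Case 2: a single .loc call with row and column indexers on the frame
# that should change. df_modified stands in for "the original" here.
df_modified = df.copy()
df_modified.loc[1:2, "a"] = 42
print(df_modified["a"].tolist())  # [0, 42, 42]
```

Neither pattern answers whether an existing object is a view, but both remove the ambiguity at the point where the object is created or written to.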

Fabio Lamanna
  • DataFrame.is_copy is already either a weakref (`type(df1.is_copy)` is `weakref`) or None (`type(df2.is_copy)` is `NoneType`). I would not like to overwrite this. – desiato Sep 15 '15 at 09:13
  • What is DataFrame.is_copy actually doing? It seems that calling it returns the original DataFrame. – desiato Sep 15 '15 at 09:21
  • I used to set it as `False` in order to avoid warnings and operate on the current dataframe and not on a copy. – Fabio Lamanna Sep 15 '15 at 09:34
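The weakref behaviour described in the comments can be inspected without overwriting anything. This is a rough sketch under strong assumptions: later pandas releases dropped the public `is_copy` attribute, and the private `_is_copy` used below is an internal detail that may be None or missing entirely (for example with Copy-on-Write enabled). `parent_of` is a hypothetical helper name, and the `getattr` default guards against the attribute not existing.

```python
import pandas as pd


def parent_of(frame: pd.DataFrame):
    """Return the frame pandas recorded as the source of `frame`, or None.

    Relies on the private `_is_copy` attribute, which is an assumption
    about pandas internals and is not part of the public API.
    """
    ref = getattr(frame, "_is_copy", None)  # weakref or None; may be absent
    return ref() if ref is not None else None


df = pd.DataFrame({"a": [0, 8, 15], "b": [42, 11, 0]})
df1 = df.loc[1:2]
df2 = df.loc[1:2].copy()

print("df1 parent is df:", parent_of(df1) is df)  # version dependent
print("df2 parent:", parent_of(df2))              # None
```

A non-None result only says that pandas is tracking the frame for warning purposes. It is not a guarantee that the data is shared.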