I wonder how I could identify whether I have a view or a copy of another dataframe. Given a pandas.DataFrame
import pandas as pd
df = pd.DataFrame( {'a': [0,8,15], 'b': [42,11,0] } )
as well as a view
df1 = df.loc[ 1:2 ]
and a copy
df2 = df.loc[ 1:2 ].copy()
which results in
>>> df
a b
0 0 42
1 8 11
2 15 0
>>> df1
a b
1 8 11
2 15 0
>>> df2
a b
1 8 11
2 15 0
Assigning values to an existing column results in a warning for df1
>>> df1[ 'a' ] = 42
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
but not for df2. However, the assignment of the value worked in both cases. How can I figure out whether I have a copy or just a view of another DataFrame?
Note that I cannot see any difference between the DataFrames df1 and df2 by type
>>> type(df1),type(df2)
by comparing elementwise equality
>>> df1 == df2
a b
1 True True
2 True True
by comparing NDFrame objects
>>> df1.equals(df2)
or even by comparing column ordering
>>> from pandas.testing import assert_frame_equal
>>> assert_frame_equal(df1, df2)
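For reference, the comparisons above can be collected into one self-contained sketch. It assumes a pandas version that provides pandas.testing (newer releases no longer ship pandas.util.testing); the names same_type and same_values are only illustrative.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

df = pd.DataFrame({'a': [0, 8, 15], 'b': [42, 11, 0]})
df1 = df.loc[1:2]          # slice of df
df2 = df.loc[1:2].copy()   # explicit copy of the same slice

# None of these checks looks at the underlying memory, only at
# the type, the labels, and the values of the two frames.
same_type = type(df1) is type(df2)
same_values = df1.equals(df2)
assert_frame_equal(df1, df2)  # raises AssertionError on any difference

print(same_type, same_values)
```

All of these compare content rather than ownership of the data, which is why they cannot tell df1 and df2 apart.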
Note that modifying the view df1 also alters the original DataFrame.
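One way to probe this relationship directly is to ask NumPy whether the column buffers overlap. This is only a sketch, not an official pandas check: shares_buffer is a hypothetical helper name, and whether a slice actually shares memory with its parent depends on the pandas version, the dtypes involved, and whether copy-on-write is enabled.

```python
import numpy as np
import pandas as pd


def shares_buffer(parent, child):
    """Hypothetical helper: True if any same-named column of the two
    frames is backed by overlapping memory."""
    return any(
        np.shares_memory(parent[col].to_numpy(), child[col].to_numpy())
        for col in child.columns
    )


df = pd.DataFrame({'a': [0, 8, 15], 'b': [42, 11, 0]})
df1 = df.loc[1:2]          # slice, may share memory with df
df2 = df.loc[1:2].copy()   # explicit copy, owns its own data

print(shares_buffer(df, df1))  # result depends on pandas version and dtypes
print(shares_buffer(df, df2))  # an explicit copy should not share memory
```

A result of False for df2 is expected because .copy() allocates new data; the result for df1 is exactly the property I would like to be able to query reliably.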
Possible duplicate: "Pandas: Subindexing dataframes: Copies vs views" does not answer how I could check this on df1 and df2.