
I have a class containing a pandas.DataFrame and a method that returns a subset of its columns. I want to get a view, not a copy.

import pandas

class Data:
    def __init__(self, path, selected_features):
        df = pandas.read_excel(path)
        # df[selected_features] contains only int64 and float64 columns
        self._features = df[selected_features].astype("float64")

    def features(self):
        return self._features.iloc[:, 0:2]  # for simplicity, just return the first 2 columns

Using np.shares_memory(), I've established that it returns a copy.

import numpy as np

data = Data(path, selected_features)
print(np.shares_memory(data._features, data.features()))
# False
print(np.shares_memory(data._features, data._features.iloc[:,0:2]))
# False

I've tried using .loc and it yields the same result.

Why is it returning a copy, and how can I make it return a view?

Note: I've seen the docs and thread 1, thread 2 and thread 3 - none helped me resolve the issue.

1 Answer


There are some errors in your code:

  • data._features does not exist in your code, maybe data.labels?
  • labels.iloc[0:2] does not return 2 columns but 2 rows.
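
To make the second point concrete, here is a small sketch on a made-up 3x4 frame (the name `frame` and its contents are illustrative, not taken from the question). A single slice passed to `.iloc` selects rows, while selecting columns needs the leading `:`.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 3 rows and 4 columns, for illustration only.
frame = pd.DataFrame(np.arange(12).reshape(3, 4), columns=list("ABCD"))

rows = frame.iloc[0:2]      # first two rows, all columns
cols = frame.iloc[:, 0:2]   # all rows, first two columns

print(rows.shape)  # (2, 4)
print(cols.shape)  # (3, 2)
```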

Why does .loc/.iloc return a copy

Or not?

>>> df = pd.DataFrame(np.random.random((10, 4)), columns=list('ABCD')).copy()

>>> df._is_view
False

>>> df._is_copy
None

>>> hex(id(df))
'0x7fd3c7a5ba60'

>>> df.iloc[:, 0:2]._is_view
True

>>> df.iloc[:, 0:2]._is_copy
<weakref at 0x7fd43731c540; to 'DataFrame' at 0x7fd3c7a5ba60>

>>> np.shares_memory(df, df.iloc[0:2])
True

>>> np.shares_memory(df, df.iloc[0:2].copy())
False

Note: copy() is important to avoid SettingWithCopyWarning because Pandas keeps a reference to the source DataFrame.
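
As for why adding `.copy()` can change what `np.shares_memory` reports, here is one plausible explanation, offered as a sketch rather than a confirmed diagnosis of the question's code. `np.shares_memory` first converts each DataFrame to a NumPy array. If the frame keeps its columns in more than one internal block, that conversion has to assemble a fresh array on every call, so the check reports False even when the slice itself is a view. A deep copy appears to consolidate same-dtype blocks into one, after which the conversion can reuse the stored buffer. The frame `raw` below is a made-up stand-in for the Excel data, and `_mgr.nblocks` is a private pandas attribute used only for inspection; its value may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data: int64 and float64 columns mixed.
raw = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5],
                    "c": [4, 5, 6], "d": [3.5, 4.5, 5.5]})

# Mixed dtypes cannot share one 2-D buffer, so every conversion builds a new array.
mixed_shares = np.shares_memory(raw.to_numpy(), raw.to_numpy())
print(mixed_shares)  # False

# astype() works block by block; the result may still hold several blocks.
converted = raw.astype("float64")
print(converted._mgr.nblocks)  # private attribute, inspection only

# A deep copy consolidates blocks of the same dtype.
consolidated = converted.copy()
print(consolidated._mgr.nblocks)

# With a single float64 block, the column slice and its parent can share a buffer.
single_shares = np.shares_memory(consolidated, consolidated.iloc[:, 0:2])
print(single_shares)
```

If the two block counts printed differ on your pandas version, that would be consistent with this explanation; if they are equal, something else is going on.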

Corralien
  • Sorry, the class is more complex; I tried to leave out the unnecessary parts and made mistakes along the way, which are now corrected. Thanks to your answer I found a mistake in my code: it was in `__init__()`. I added a `.copy()` to the line assigning to `self._features`, as in your code, and now it works as intended. – Artur Stopa Apr 25 '23 at 18:26
  • I don't understand why copying makes a difference, though. Could you please explain? Please edit your answer and I will mark it as the solution. – Artur Stopa Apr 25 '23 at 18:44
  • I updated my answer. I added a link to an interesting Q&A about `copy`. – Corralien Apr 25 '23 at 19:25