Consider some Python 3 code where I have a very large object and need to store a reference to it within a class because a third-party library requires it.

In this case, the object df is a very large numpy array (>20 GB) and my system's memory is already almost maxed out. Does assigning df as an instance attribute duplicate the memory required, or simply act as a reference to the existing object?

In code:

import numpy as np
df = np.random.rand(1000, 5)  # Some large numpy array

class MyClass:
    def __init__(self, df):
        self.df = df
    def get_first(self):
        return self.df[0]

instance = MyClass(df)        # Does this copy the object `df`?

I'm hitting some memory issues later on and am trying to debug where they might be coming from. My intuition tells me that Python knows not to copy the object; however, if we do something like del df, then instance.df is still defined.

Adam
  • You should be able to check by comparing object IDs, for example `id(df)` vs `id(instance.df)`. – 101 Aug 12 '22 at 03:16

1 Answer

Does Python make a copy of objects on assignment?

This is because in Python, variables (names) are just references to individual objects. When you assign dict_a = dict_b, you are really copying a memory address (or pointer, if you will) from dict_b to dict_a. There is still one instance of that dictionary.

In your example, df and instance.df play the same roles as dict_a and dict_b: both names hold a reference to the same array, so the assignment self.df = df does not duplicate the data.
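One way to check this for the class in the question is to compare object identities. The following is a minimal sketch that uses a small array as a stand-in for the 20 GB one; the variable names same_object, same_id, shares_buffer and copy_is_separate are only illustrative.

```python
import numpy as np

class MyClass:
    def __init__(self, df):
        self.df = df  # binds another name to the same array; no data is copied

    def get_first(self):
        return self.df[0]

df = np.random.rand(1000, 5)  # small stand-in for the large array
instance = MyClass(df)

same_object = instance.df is df                    # identity check
same_id = id(instance.df) == id(df)                # equivalent check via id()
shares_buffer = np.shares_memory(instance.df, df)  # same underlying data buffer
print(same_object, same_id, shares_buffer)

# By contrast, an explicit copy allocates a new buffer.
copied = df.copy()
copy_is_separate = not np.shares_memory(copied, df)
print(copy_is_separate)
```

If all three checks are true, the instance attribute adds no meaningful memory beyond the reference itself. Extra memory would only appear where something makes an explicit copy, such as df.copy().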

How to delete every reference of an object in Python?

No no no. Python has a garbage collector that has very strong territory issues - it won't mess with you creating objects, you don't mess with it deleting objects.

Simply put, it can't be done, and for a good reason.

This is by design. Deleting the variable df does not destroy the array; it only removes the name df and the reference it held. As long as instance.df still refers to the array, you can access it even after you have deleted df.
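The effect of del can be observed directly. This is a rough sketch, again with a small stand-in array. It relies on sys.getrefcount, whose absolute values are a CPython implementation detail, so only the difference before and after del is meaningful here.

```python
import sys
import numpy as np

class MyClass:
    def __init__(self, df):
        self.df = df

df = np.random.rand(1000, 5)  # small stand-in for the large array
instance = MyClass(df)

refs_before = sys.getrefcount(instance.df)  # includes both `df` and `instance.df`
del df                                      # removes the name, not the array
refs_after = sys.getrefcount(instance.df)

try:
    df
    name_exists = True
except NameError:
    name_exists = False

print(refs_before - refs_after)  # number of references dropped by `del df`
print(name_exists)               # the name `df` is gone
print(instance.df.shape)         # the array is still reachable via the instance
```

The array's memory can be released only once the last reference to it is gone, so freeing it here would also require dropping instance.df (or the instance itself) and any other names or views that still refer to the data.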

user6346643
  • Please read [answer], and flag duplicate questions as duplicates rather than trying to answer them. To the extent that both links are needed to understand the answer, it is two questions, and should not have been asked like that anyway; for each question, it is adequately answered at the link without any further explanation. – Karl Knechtel Aug 12 '22 at 03:29
  • Thanks for the answer @user6346643. I have already seen the linked questions prior to asking, but was not sure if Python treats this behavior in classes/objects differently. – Adam Aug 12 '22 at 03:34