I was quite surprised by some behavior I saw when assigning a constructor argument to an instance attribute in Python. Perhaps someone can explain it and help me stop it from happening.

Essentially, the changes I make to the attribute inside the class methods also show up in the global variable that was passed to __init__ as an argument.

Is there a built-in way to prevent this? In many cases it could break the data variable for other uses down the line.

Here is a basic version of the code:

import pandas as pd

class BasicClass:

    def __init__(self, data_raw):
        self.data = data_raw
        self.data['new_column'] = 1

# Now outside the class

data = pd.read_csv(...)

data.columns
Out[1]: ['orig_column']

obj = BasicClass(data)

data.columns
Out[2]: ['orig_column','new_column']
datavoredan

1 Answer

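This is because self.data and data both refer to the same object. Assigning data_raw to self.data does not copy anything; it only binds another name to the existing DataFrame, so a change made through one name is visible through the other.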
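
The aliasing described above can be reproduced in a few lines. This is only a minimal sketch: a plain dict stands in for the DataFrame (an assumption made so it runs without pandas), and the names alias, same_object and keys_seen_via_data are hypothetical.

```python
# A plain dict stands in for the DataFrame (assumption, so no pandas is needed).
data = {"orig_column": [1, 2, 3]}

alias = data                 # assignment binds a second name; nothing is copied
alias["new_column"] = 1      # mutate the object through the second name

same_object = alias is data          # both names refer to one object
keys_seen_via_data = list(data)      # the original name sees the new key too

print(same_object, keys_seen_via_data)
```

The same identity check, `obj.data is data`, should show the shared object in the question's code.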

If you want an independent copy of the DataFrame, call its copy() method, which makes a deep copy by default:

def __init__(self, data_raw):
    self.data = data_raw.copy()
    self.data['new_column'] = 1

See also: How to clone or copy a list?
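
For objects that do not offer their own copy() method, the standard library's copy.deepcopy can play the same defensive role. The following is a sketch under assumptions, not the original code: a dict again stands in for the DataFrame, and original_keys and instance_keys are hypothetical names used only to inspect the result.

```python
import copy


class BasicClass:
    def __init__(self, data_raw):
        # Defensive copy: the instance gets its own object to mutate,
        # so the caller's data is left untouched.
        self.data = copy.deepcopy(data_raw)
        self.data["new_column"] = 1


# A plain dict stands in for the DataFrame (assumption, so no pandas is needed).
data = {"orig_column": [1, 2, 3]}
obj = BasicClass(data)

original_keys = list(data)       # keys of the caller's object
instance_keys = list(obj.data)   # keys of the instance's private copy

print(original_keys, instance_keys)
```

The trade-off is memory and time: every instance holds a full duplicate, so for large data it may be preferable to copy only when the class actually needs to modify its input.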

Praveen