import pandas as pd

class MyClass():
    def __init__(self, df):
        self.df = df

    def edit(self):
        self.df = self.df[~self.df['numbers'].isin([2,4,6,8,10])]

df = pd.DataFrame([1,2,3,4,5,6,7,8,9,10], columns=['numbers'])
obj = MyClass(df).edit()
print(df)

I expect print df to print the dataframe that was reassigned by the filtering method.

But it still prints the dataframe as it was before the edit method modified it.

How can my outer variable track the changes made to df inside the class?

Raheel
  • Strings are immutable. `var` will not change until you reassign the name. – timgeb Nov 07 '17 at 09:11
  • OK, in actuality it is a pandas DataFrame. Should I change it to a list to keep it simple? – Raheel Nov 07 '17 at 09:12
  • I don't know because I don't see the point of your class at all. If you want to mutate a mutable object, why not use the methods it exposes directly? – timgeb Nov 07 '17 at 09:13
  • Also, it is considered an anti-pattern to globally modify arguments of functions inside the function, let alone arguments used as "initializes". Imagine trying to debug a program with several "`var`" variables and `MyClass` instances. – DeepSpace Nov 07 '17 at 09:14
  • I am updating the actual problem which will make more sense. – Raheel Nov 07 '17 at 09:16
  • FYI your code would "work" if you made `var = [1,2,3]` and changed the body of `edit` to `self.var.pop()`, for example. – timgeb Nov 07 '17 at 09:22
  • @timgeb that was the simulation. Can you please look at the question now? It is not a list but a pandas dataframe. – Raheel Nov 07 '17 at 09:24
  • You have to mutate the variable. *assignment is not mutation*. – juanpa.arrivillaga Nov 07 '17 at 09:41

3 Answers

How can my outer variable track the changes made to df inside the class?

The way you are trying to do this, namely by reassignment, it's impossible. Names do not "see" the reassignment of other names (imagine the catastrophe if they did).
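
A minimal sketch of the difference between rebinding a name and mutating an object, using plain lists and the hypothetical names `outer` and `inner` (neither is from the question):

```python
# Two names bound to the same list object.
outer = [1, 2, 3, 4]
inner = outer

# Rebinding: `inner` now refers to a brand-new list; `outer` is untouched.
inner = [n for n in inner if n % 2]
rebinding_visible = outer is inner  # False

# Mutation: bind both names to one object again and change it in place.
inner = outer
inner.remove(2)
mutation_visible = outer == [1, 3, 4]  # True

print(rebinding_visible, mutation_visible)
```

The same rule applies to `self.df` and the outer `df`: they share an object only until one of them is assigned something new.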

Your only chance without reassigning df here is to mutate the dataframe. All you currently do is create a new object and reassign self.df. The df outside of your class does not care about that and still points to the old object.
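
One possible way to mutate instead of reassign is sketched below. It assumes that dropping rows in place with `DataFrame.drop(..., inplace=True)` is acceptable for your use case; this is an illustration, not the only option.

```python
import pandas as pd

class MyClass:
    def __init__(self, df):
        self.df = df

    def edit(self):
        # Find the row labels to remove, then drop them from the existing
        # object instead of binding self.df to a new dataframe.
        unwanted = self.df.index[self.df['numbers'].isin([2, 4, 6, 8, 10])]
        self.df.drop(index=unwanted, inplace=True)

df = pd.DataFrame([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], columns=['numbers'])
obj = MyClass(df)
obj.edit()

# df and obj.df are still the same object, so df shows the filtered rows.
print(df['numbers'].tolist())
```

Keep in mind that an in-place edit like this changes the dataframe for every piece of code holding a reference to it, which is the debugging concern raised in the comments on the question.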

Of course the sensible thing to do would be to just reassign df to the return value of some function or method, i.e.:

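import pandas as pd

def compute_new_df_from_old_df(df):
    # Build and return a new dataframe; the one passed in is not modified.
    return df[~df['numbers'].isin([2, 4, 6, 8, 10])]

df = pd.DataFrame([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], columns=['numbers'])
df = compute_new_df_from_old_df(df)  # rebind the outer name to the result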
timgeb
  • Actually the `var` outside is responsible for calling other stuff such as `var.to_csv()`, so that is why I needed the var to be in its final state – Raheel Nov 07 '17 at 09:30

Try rewriting your code as:

df = pd.DataFrame([1,2,3,4,5,6,7,8,9,10], columns=['numbers'])
obj = MyClass(df)
obj.edit()
print(obj.df)
sureshvv

Immutable objects aren't passed by reference (and technically, neither are mutable objects). Is there any specific reason you want to do this?

Daniel
  • Sorry, I was trying to simulate the problem I was having. I have updated the question now, can you please check? – Raheel Nov 07 '17 at 09:23