I have a function called ChangeDF

import pandas as pd

def ChangeDF(df):
    df = df[["Col1","Col2"]]


df = pd.DataFrame([[1, "One", "Hello"], [2, "Two", "Hi"]], columns=["Col1", "Col2", "Col3"])
ChangeDF(df)
print(df)

It is supposed to remove one column from the dataframe.

But I want to change the actual dataframe when I call the ChangeDF function.

As it stands, it just creates another instance of df.

How can I change it by reference, not by value?

The df should print

1 One
2 Two

not

1 One Hello
2 Two Hi
Marios
asmgx

3 Answers

Pandas tends to default to copy-and-modify rather than in-place operations. Following that convention, return the new DataFrame and reassign it:

def ChangeDF(df):
    # Return a new DataFrame containing only the wanted columns
    return df[["Col1","Col2"]]

df = pd.DataFrame([[1, "One", "Hello"], [2, "Two", "Hi"]], columns=["Col1", "Col2", "Col3"])
df = ChangeDF(df)

Or, if you really want an in-place change:

def ChangeDFInPlace(df):
    # Drop every column except Col1 and Col2 from the caller's DataFrame
    to_remove = [x for x in df.columns if x not in ("Col1", "Col2")]
    df.drop(to_remove, axis=1, inplace=True)
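
To see why the ChangeDF in the question has no effect, here is a minimal sketch. The function names rebind_only and mutate_in_place are made up for illustration. Assigning to the parameter only points the local name at a new DataFrame, while drop(..., inplace=True) modifies the object the caller also holds.

```python
import pandas as pd

def rebind_only(df):
    # Hypothetical name: the assignment builds a new DataFrame and rebinds
    # the local name df to it; the caller's object is not touched.
    df = df[["Col1", "Col2"]]

def mutate_in_place(df):
    # Hypothetical name: drop(..., inplace=True) changes the same object
    # that the caller passed in.
    df.drop(columns=["Col3"], inplace=True)

original = pd.DataFrame(
    [[1, "One", "Hello"], [2, "Two", "Hi"]],
    columns=["Col1", "Col2", "Col3"],
)

rebind_only(original)
columns_after_rebind = list(original.columns)    # still three columns

mutate_in_place(original)
columns_after_mutation = list(original.columns)  # Col3 is gone

print(columns_after_rebind)
print(columns_after_mutation)
```

So Python passes the reference to the same object in both cases; what matters is whether the function mutates that object or merely reassigns its own local name.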
Blownhither Ma

Solution:

Use the drop() method and set inplace=True:

def ChangeDF(df):
    df.drop(["Col3"], axis=1, inplace=True)
    
df = pd.DataFrame([[1, "One", "Hello"], [2, "Two", "Hi"]], columns=["Col1", "Col2", "Col3"])
ChangeDF(df)
print(df)
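
Besides drop(), DataFrame.pop and del df[...] also remove a column from the existing object, so they can be used inside a function in the same way. A small sketch, assuming you only need to remove one named column; the helper name remove_column is made up for illustration:

```python
import pandas as pd

def remove_column(df, name):
    # Hypothetical helper: pop removes the column from df itself and
    # returns the removed column as a Series.
    return df.pop(name)

df = pd.DataFrame(
    [[1, "One", "Hello"], [2, "Two", "Hi"]],
    columns=["Col1", "Col2", "Col3"],
)

removed = remove_column(df, "Col3")
remaining = list(df.columns)

print(remaining)       # columns left in the caller's DataFrame
print(list(removed))   # values of the removed column
```

pop is handy when the removed values are still needed; if they are not, del df["Col3"] inside the function has the same effect on the caller's DataFrame.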
Marios

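You need to change the DataFrame in the global namespace. Because a function parameter cannot be declared global, define a nested function inside ChangeDF and declare df global there. Note that this rebinds the module-level variable named df and ignores the argument passed in, so it only works when the caller's DataFrame is a global called df.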

import pandas as pd

def ChangeDF(df):
    def change_global():
        global df
        df = df[["Col1","Col2"]]
    change_global()


df = pd.DataFrame([[1, "One", "Hello"], [2, "Two", "Hi"]], columns=["Col1", "Col2", "Col3"])
ChangeDF(df)
print(df)

Output:

        Col1 Col2
    0     1  One
    1     2  Two
ashraful16