I am confused about when changes made to global variables inside a function are retained after the function has executed, when no global or nonlocal statement is present.
import pandas as pd
import numpy as np
d = {'col1': [1, 2], 'col2': [3, 4]}
e = {'col3': [5, 7], 'col4': [8, 9]}
df = pd.DataFrame(data=d)
dfe = pd.DataFrame(data=e)
print("function 1:")
def func1(df_name):
    df_name = df_name + 1
    df_name.drop(df_name.columns[0], axis=1, inplace=True)
    print("inside function:\n", df_name)
func1(df)
print("outside function:\n", df)
print("\n")
print("function 2:")
def func2(df_name, col_name):
    df_name['col6'] = df_name[col_name] + 1
    print("inside function:\n", df_name)
func2(df, "col1")
print("outside function:\n", df)
print("\n")
print("function 3:")
def func3(df_name1, df_name2):
    df_name1 = pd.concat([df_name1, df_name2], axis=1)
    print("inside function:\n", df_name1)
func3(df, dfe)
print("outside function:\n", df)
Output:
function 1:
inside function:
    col2
0     4
1     5
outside function:
    col1  col2
0     1     3
1     2     4

function 2:
inside function:
    col1  col2  col6
0     1     3     2
1     2     4     3
outside function:
    col1  col2  col6
0     1     3     2
1     2     4     3

function 3:
inside function:
    col1  col2  col6  col3  col4
0     1     3     2     5     8
1     2     4     3     7     9
outside function:
    col1  col2  col6
0     1     3     2
1     2     4     3
Function 1 shows that adding a value to the DataFrame and then dropping a column is not retained outside the function, which is what I expected. Function 2 shows that adding a new column is retained, which surprised me. Function 3, which I thought was just another way of adding columns, does not keep col3 and col4. What kind of namespace prison break is going on in Function 2? In what other scenarios will I see this phenomenon? Thanks.
BTW, it is intentional that none of these functions has an explicit return, as I am trying to understand what is going on. I am aware that this may not be best practice.
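For comparison, here is a minimal sketch of what looks like the same effect with a plain Python list and no pandas. It assumes that the difference between the functions above comes down to rebinding a local name (as in `df_name = df_name + 1`) versus changing the passed-in object in place (as in `df_name['col6'] = ...`). The names `rebind`, `mutate`, and `items` are hypothetical and used only for illustration.

```python
def rebind(values):
    # Assumption being illustrated: `values + [99]` builds a new list, and the
    # assignment only points the local name `values` at that new list.
    values = values + [99]
    return values


def mutate(values):
    # append() changes the very list object that the caller also refers to.
    values.append(99)
    return values


items = [1, 2]

rebound = rebind(items)
after_rebind = list(items)  # snapshot of the caller's list after rebind()

mutated = mutate(items)

print(rebound is items, after_rebind)  # False [1, 2]
print(mutated is items, items)         # True [1, 2, 99]
```

If this assumption holds, Function 2 behaves like `mutate`, while Functions 1 and 3 behave like `rebind`: after the first assignment, the local name refers to a different object than the global `df`.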