I have been playing with Python and trying to understand the difference between copying a DataFrame with the .copy() method and simply assigning it to another variable.
Let's say we have the following DataFrame, dfx:
Name Score1 Score2 Score3 Score4
0 Jack 10 Perfect 10 Perfect
1 Jill 10 10 10 Not Finished
2 Jane 20 10 10 5
3 Tom Not Finished 15 10 5
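For anyone who wants to reproduce this, here is a minimal sketch of how dfx could be built. The mix of integers and strings in the score columns is an assumption read off the printout above, not something the original data guarantees.

```python
import pandas as pd

# Rebuild dfx from the printed table. Treating the numeric-looking cells as
# ints and the rest as strings is an assumption made for illustration.
dfx = pd.DataFrame({
    "Name": ["Jack", "Jill", "Jane", "Tom"],
    "Score1": [10, 10, 20, "Not Finished"],
    "Score2": ["Perfect", 10, 10, 15],
    "Score3": [10, 10, 10, 10],
    "Score4": ["Perfect", "Not Finished", 5, 5],
})

print(dfx)
```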
dfx2 = dfx.drop("Score1", axis=1)
dfx2:
Name Score2 Score3 Score4
0 Jack Perfect 10 Perfect
1 Jill 10 10 Not Finished
2 Jane 10 10 5
3 Tom 15 10 5
Running dfx again still returns the original DataFrame:
Name Score1 Score2 Score3 Score4
0 Jack 10 Perfect 10 Perfect
1 Jill 10 10 10 Not Finished
2 Jane 20 10 10 5
3 Tom Not Finished 15 10 5
Shouldn't the assignment cause the column "Score1" to be dropped from the original DataFrame as well?
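As a quick sanity check, here is a sketch on a cut-down stand-in for dfx (the two-row frame and the name dfx_inplace are made up for illustration). It suggests that drop, called without inplace=True, hands back a separate object rather than touching the one it was called on:

```python
import pandas as pd

# Small stand-in for dfx; values are illustrative only.
dfx = pd.DataFrame({
    "Name": ["Jack", "Jill"],
    "Score1": [10, 10],
    "Score2": ["Perfect", 10],
})

# With the default inplace=False, drop returns a new DataFrame.
dfx2 = dfx.drop("Score1", axis=1)
print(dfx2 is dfx)              # are the two names bound to the same object?
print("Score1" in dfx.columns)  # is the column still present in the original?

# With inplace=True, the object itself is modified and drop returns None.
dfx_inplace = dfx.copy()
result = dfx_inplace.drop("Score1", axis=1, inplace=True)
print(result)
print("Score1" in dfx_inplace.columns)
```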
However, running the following:
dfx3 = dfx
dfx3
Name Score1 Score2 Score3 Score4
0 Jack 10 Perfect 10 Perfect
1 Jill 10 10 10 Not Finished
2 Jane 20 10 10 5
3 Tom Not Finished 15 10 5
dfx3.loc[0,"Score4"] = "BAD"
dfx3
Name Score1 Score2 Score3 Score4
0 Jack 10 Perfect 10 BAD
1 Jill 10 10 10 Not Finished
2 Jane 20 10 10 5
3 Tom Not Finished 15 10 5
dfx
Name Score1 Score2 Score3 Score4
0 Jack 10 Perfect 10 BAD
1 Jill 10 10 10 Not Finished
2 Jane 20 10 10 5
3 Tom Not Finished 15 10 5
does cause the original dataset to be modified.
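A small sketch of the same experiment, again on a made-up two-row stand-in for dfx, checks whether dfx3 and dfx are literally the same object, and compares that with what happens after .copy() (dfx4 is a hypothetical name used only here):

```python
import pandas as pd

# Small stand-in for dfx; values are illustrative only.
dfx = pd.DataFrame({
    "Name": ["Jack", "Jill"],
    "Score4": ["Perfect", "Not Finished"],
})

# Plain assignment binds a second name to the same DataFrame object.
dfx3 = dfx
print(dfx3 is dfx)

# Writing through dfx3 is therefore visible through dfx too.
dfx3.loc[0, "Score4"] = "BAD"
print(dfx.loc[0, "Score4"])

# .copy() (deep by default) creates an independent DataFrame.
dfx4 = dfx.copy()
dfx4.loc[1, "Score4"] = "CHANGED"
print(dfx.loc[1, "Score4"])
```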
Is there an explanation for why dropping a column does not modify the original DataFrame, while changing an element does? It also seems that changing a column name through the assigned variable modifies the original DataFrame as well.
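The column-name observation can be sketched the same way. The one-row frame and the names Round1, First, and dfx5 below are hypothetical, chosen only to show the contrast between assigning to .columns and calling rename without inplace=True:

```python
import pandas as pd

# Tiny stand-in for dfx; values and new column names are illustrative only.
dfx = pd.DataFrame({"Name": ["Jack"], "Score1": [10]})
dfx3 = dfx  # same object under a second name

# Assigning to .columns changes the shared object in place.
dfx3.columns = ["Name", "Round1"]
print(list(dfx.columns))

# rename() without inplace=True returns a new DataFrame and leaves dfx alone.
dfx5 = dfx.rename(columns={"Round1": "First"})
print(list(dfx.columns))
print(list(dfx5.columns))
```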