Can someone help me understand what is happening here:

a = 1
b = a
b = 2
print(a)
print(b)

Here, obviously a will be unchanged because assigning 2 to b does not alter a.

In pandas, however:

import pandas as pd

a = pd.DataFrame({'a': [1, 2, 3]})
b = a
b.iloc[0, 0] = 100
print(a)
print(b)

Why do both a and b now show 100 instead of 1? I had been overwriting my original variables without realizing it, because I thought b = a created a new object in pandas. I had to use b = a.copy() to avoid this.

elbord77
Does this answer your question? https://stackoverflow.com/questions/17246693/what-is-the-difference-between-shallow-copy-deepcopy-and-normal-assignment-oper#:~:text=A%20shallow%20copy%20constructs%20a,objects%20found%20in%20the%20original. – Anurag Dabas Mar 27 '21 at 16:19

3 Answers

In the pandas example, b = a does not create a copy of the DataFrame. There is a single DataFrame in memory, and both b and a are references to it. When you change that object, the change is visible in both a and b since they are pointing at the same thing.
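One way to check this yourself is Python's `is` operator, which tests whether two names refer to the same object. This is only a minimal sketch, assuming pandas is installed; the helper names `same_object` and `value_seen_through_a` are made up for illustration.

```python
import pandas as pd

a = pd.DataFrame({'a': [1, 2, 3]})
b = a  # binds a second name to the same DataFrame; nothing is copied

# `is` compares identity, not contents
same_object = b is a

# Mutate the single shared DataFrame through the name b
b.iloc[0, 0] = 100

# Read the same cell back through the name a
value_seen_through_a = int(a.iloc[0, 0])

print(same_object, value_seen_through_a)
```

Since there is only one DataFrame, the write made through b shows up when reading through a.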

If you want to create a copy of the data you could write:

import pandas as pd

a = pd.DataFrame({'a': [1, 2, 3]})
b = pd.DataFrame(a, copy=True)  # b gets its own copy of the data
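The `a.copy()` call mentioned in the question should have the same effect, since `DataFrame.copy()` makes a deep copy by default. A rough sketch, assuming pandas is installed (the names `original_value` and `copied_value` are illustrative only):

```python
import pandas as pd

a = pd.DataFrame({'a': [1, 2, 3]})
b = a.copy()  # deep=True is the default, so b has its own data

# Only the copy is modified here
b.iloc[0, 0] = 100

original_value = int(a.iloc[0, 0])
copied_value = int(b.iloc[0, 0])

print(original_value, copied_value)
```

Here a keeps its original value while b holds the new one, because they are now two separate objects.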

This reference behavior applies to objects in Python generally: a variable name just points to an object in memory, and if you change the object, the change is visible through every name that refers to it.

The first example behaves differently because b = 2 does not modify an object at all; it rebinds the name b. Integers are immutable (they can't be changed in place), so you never turn the 1 into a 2. You just make b point at a different integer, while a still points at 1.
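To illustrate the difference between rebinding a name and changing an object, here is a small sketch in plain Python. The variable names are arbitrary; the second half uses a list only to show that rebinding leaves even a mutable object untouched.

```python
# Integers: after b = a, both names refer to the same int object
a = 1
b = a
shared_before = b is a

# Assigning to b rebinds the name; the object 1 is not modified
b = 2
shared_after = b is a

# Rebinding works the same way for a mutable object such as a list
x = [1]
y = x
y = [2]  # y now names a new list; the list x refers to is unchanged

print(shared_before, shared_after, a, x)
```

So plain assignment (`b = ...`) never alters the object the other name points at, whatever its type. Only in-place operations such as `b.iloc[0, 0] = 100` do.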

Dustin Michels

Because in your first example a and b end up assigned to two different integers (1 and 2), which are two distinct immutable objects in memory.

Whereas in your second example, a and b refer to the same mutable object (a DataFrame). So assigning through b.iloc is the same as assigning through a.iloc.

Laurent

Had you used lists instead of integers, the result would have been different:

a = [1]
b = a
print(a, b)
b[0] = 2
print(a, b)

Result:

[1] [1]
[2] [2]
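If you want an independent list, you need a copy here as well. The sketch below uses a nested list (my own example, not from the question) to show how a shallow copy from `list.copy()` differs from `copy.deepcopy()` in the standard library.

```python
import copy

a = [[1], [2]]

shallow = a.copy()       # new outer list, but the inner lists are shared
deep = copy.deepcopy(a)  # new outer list and new inner lists

# Appending to the shallow copy does not affect a's outer list
shallow.append([3])

# Mutating a shared inner list IS visible through a
shallow[0][0] = 100

# The deep copy shares nothing with a
deep[1][0] = 200

print(a, shallow, deep)
```

For a flat list of integers a shallow copy is enough. The distinction only matters when the list contains other mutable objects.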
RufusVS