10

Okay, a very silly question, I'm sure, but how does Python assign values to variables?

Say there is a variable `a` that is assigned the value 2 (`a=2`). So Python assigns a memory location to the variable, and `a` now points to the memory location that contains the value 2. Now, if I assign a variable `b=a`, the variable `b` also points to the same location as `a`.

Now, if I assign a variable `c=2`, it still points to the same memory location as `a` instead of pointing to a new memory location. So, how does Python work? Does it first check all the previously assigned variables to see if any of them hold the same value, and then reuse that memory location?

Also, it doesn't work the same way with lists. If I assign `a=[2,3]` and then `b=[2,3]` and check their memory locations with the `id` function, I get two different memory locations. But `c=b` gives me the same location. Can someone explain how this works and the reason for it?

Edit:

Basically, my question comes from the fact that I've just started learning about the `is` operator, and apparently it is True only if both operands point to the same location. So, with `a=1000` and `b=1000`, `a is b` is False, but with `a="world"` and `b="world"` it is True.

qwertp
  • [Here](http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#python-has-names) is a good explanation for some of this behavior -- unfortunately, I don't know what happens on a deeper level than that. (credit: @Sophologist) – inspectorG4dget Dec 30 '15 at 00:11
  • Worth a read: http://nedbatchelder.com/text/names.html – jonrsharpe Dec 30 '15 at 00:11
  • id(2) and id(c) with c=2 returns the same values, while lists and objects have their separate memory addresses, which can be explained by pass by reference and pass by value (copy). – stian Dec 30 '15 at 00:14
  • This will explain the int and string caching: http://stackoverflow.com/questions/28329498/why-does-a-space-effect-the-identity-comparison-of-equal-strings/28329522#28329522, the caching is a CPython implementation detail, and there are other peephole optimizations that Python does which can also mean two objects point to the same memory location. In general `a = b` is always going to give you a reference to b, so `a is b`. The int caching and string interning is just an implementation detail – Padraic Cunningham Dec 30 '15 at 00:35
  • And yes `is` checks the identity so if a is b they both point to the same object, there are exceptions where the id can be different but the objects are actually the same http://stackoverflow.com/questions/29777368/python-arrays-are-automatically-copying-each-other/29777450#29777450 – Padraic Cunningham Dec 30 '15 at 00:51
  • This shouldn't be a duplicate as it stands, simply given that the titles seem totally unrelated. The title of the duplicate should be changed to mark this as a duplicate. – Nir Friedman Dec 30 '15 at 01:26

3 Answers

11

I've faced this problem before and understand that it gets confusing. There are two concepts here:

  1. some data structures are mutable, while others are not
  2. Python works off pointers... most of the time

So let's consider the case of a list first (you accidentally stumbled on interning and peephole optimizations when you used ints; I'll get to that later).

So let's create two identical lists (remember lists are mutable)

In [42]: a = [1,2]

In [43]: b = [1,2]

In [44]: id(a) == id(b)
Out[44]: False

In [45]: a is b
Out[45]: False

See, despite the fact that the lists are equal, `a` and `b` refer to different objects at different memory locations. This is because Python evaluates `[1,2]`, stores the resulting list at a memory location, and then binds the name `a` (or `b`) to that location. It would take quite a long time for Python to check every allocated object to see whether a `[1,2]` already exists, just so that `b` could share `a`'s memory location.
And that's not to mention that lists are mutable, i.e. you can do the following:

In [46]: a = [1,2]

In [47]: id(a)
Out[47]: 4421968008

In [48]: a.append(3)

In [49]: a
Out[49]: [1, 2, 3]

In [50]: id(a)
Out[50]: 4421968008

See that? The value that `a` holds has changed, but its memory location has not. Now, what if a bunch of other variable names had silently been assigned to the same memory location? They would all see the change as well, which would be a flaw in the language. To avoid this, Python would have to copy the entire list to a new memory location just because I wanted to change the value of `a`.
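To make the distinction concrete, here is a minimal sketch (the name `old` is only there for illustration) that contrasts mutating a list in place with building a new list and rebinding the name:

```python
a = [1, 2]
old = a                 # keep a second name for the original list object

a.append(3)             # in-place mutation: same object, new contents
mutated_in_place = a is old

a = a + [4]             # builds a brand-new list, then rebinds the name a
rebound_to_new = a is not old

print(mutated_in_place, rebound_to_new)  # True True
print(old, a)                            # [1, 2, 3] [1, 2, 3, 4]
```

Only the `append` call changes the object that other names may be sharing; `a = a + [4]` leaves the old list untouched and points `a` somewhere else.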

Two separately created lists are distinct objects even when both are empty:

In [51]: a = []

In [52]: b = []

In [53]: a is b
Out[53]: False

In [54]: id(a) == id(b)
Out[54]: False
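The sessions above all come down to the difference between "equal" and "identical". As a small sketch (variable names are illustrative), `==` compares contents while `is` compares object identity:

```python
a = [1, 2]
b = [1, 2]
empty_a, empty_b = [], []

# == asks "do these hold the same values?"; is asks "is this one object?"
print(a == b, a is b)                          # True False
print(empty_a == empty_b, empty_a is empty_b)  # True False
```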

Now, let's talk about that stuff I said about pointers:

Let's say you want two variables to actually refer to the same memory location. Then you could assign your first variable to your second:

In [55]: a = [1,2,3,4]

In [56]: b = a

In [57]: id(a) == id(b)
Out[57]: True

In [58]: a is b
Out[58]: True

In [59]: a[0]
Out[59]: 1

In [60]: b[0]
Out[60]: 1

In [61]: a
Out[61]: [1, 2, 3, 4]

In [62]: b
Out[62]: [1, 2, 3, 4]

In [63]: a.append(5)

In [64]: a
Out[64]: [1, 2, 3, 4, 5]

In [65]: b
Out[65]: [1, 2, 3, 4, 5]

In [66]: a is b
Out[66]: True

In [67]: id(a) == id(b)
Out[67]: True

In [68]: b.append(6)

In [69]: a
Out[69]: [1, 2, 3, 4, 5, 6]

In [70]: b
Out[70]: [1, 2, 3, 4, 5, 6]

In [71]: a is b
Out[71]: True

In [72]: id(a) == id(b)
Out[72]: True

Look what happened there! `a` and `b` both refer to the same memory location. Therefore, any change you make through one will be visible through the other.
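If you want a second list that does not follow changes to the first, you have to ask for a copy explicitly. A minimal sketch, with the names `alias`, `shallow` and `deep` chosen just for this example:

```python
import copy

a = [1, 2, 3, 4]
alias = a                # another name for the same list object
shallow = list(a)        # new list object holding the same elements
deep = copy.deepcopy(a)  # independent copy, including any nested objects

a.append(5)

print(alias)    # [1, 2, 3, 4, 5]
print(shallow)  # [1, 2, 3, 4]
print(deep)     # [1, 2, 3, 4]
```

For a flat list of ints the shallow and deep copies behave the same; the difference only matters once the list contains other mutable objects.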

Lastly, let's talk briefly about that interning and peephole stuff I mentioned before. Python (CPython, to be precise) tries to save space and time, so it creates a few small objects when it starts up (small integers, for example). As a result, when you assign a small integer (like 5) to a variable, Python doesn't have to create a new object for 5 before binding a name to it (unlike what it did for your lists). Since 5 already exists at some memory location, all Python does is bind the variable name to that existing object. For much larger integers, however, this is no longer the case:

In [73]: a = 5

In [74]: b = 5

In [75]: id(a) == id(b)
Out[75]: True

In [76]: a is b
Out[76]: True

In [77]: a = 1000000000

In [78]: b = 1000000000

In [79]: id(a) == id(b)
Out[79]: False

In [80]: a is b
Out[80]: False
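The question's edit asks about strings as well. The following is a rough sketch of the same idea using `sys.intern` from the standard library; the strings are built at run time on purpose, and the variable names are only illustrative:

```python
import sys

# Build the strings at run time so the compiler cannot merge equal literals.
x = "".join(["wor", "ld"])
y = "".join(["wor", "ld"])

print(x == y)                          # True: same characters
print(sys.intern(x) is sys.intern(y))  # True: interning returns one shared object

# Whether x is y holds before interning is an implementation detail,
# so no expected value is shown for it.
print(x is y)
```

The takeaway is the same as for integers: compare values with `==`, and treat any sharing that `is` reveals for immutable objects as an optimization you should not rely on.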
inspectorG4dget
  • OK, let's see if I got it right. Because lists are mutable, it would take Python a lot of time to move lists that previously shared the same values to different locations whenever one of them is changed. But what about strings and integers? They are both immutable, and each assignment should get a different location, like with integers. However, equal strings get the same location. Please check the edited part of the question. – qwertp Dec 30 '15 at 00:54
  • @qwertp: I used lists to demonstrate how that would be difficult. The principle carries through all data structures. Remember what I said about the peephole optimizations (how python loads a few things right off the bat)? That also applies to small strings, small integers (which I've demonstrated in my post), and some other data structures as well. – inspectorG4dget Dec 30 '15 at 00:58
  • @qwertp, the strings are the same object because all strings that are made up of just letters and/or numbers are interned and reused; this is covered explicitly in the link I posted in the comments. If you used `a="$hello"` `b="$hello"`, `a is b` would be False. This is not something to rely on. – Padraic Cunningham Dec 30 '15 at 01:02
  • Your statement "python works off pointers most of the time" isn't really correct. Python works off pointers all the time; it's just that for certain things (small integers, None) it implements their values as singletons, so new objects never need to be created. Or, to elaborate further: it doesn't really give the memory address at which 2 is stored the name a; rather, it creates a pointer a that points to the memory address where 2 is stored. a is still a pointer. – Nir Friedman Dec 30 '15 at 01:04
  • @NirFriedman: you're absolutely right. I was unable to word that thought quite as eloquently. Please edit my answer to include your comment – inspectorG4dget Dec 30 '15 at 01:06
0

This is an optimization that Python performs for small integers. In general, you can't count on `a` and `c` pointing to the same location. If you try this experiment with progressively larger integers, you'll see that it stops working at some point. I'm pretty sure 1000 is large enough, but I'm not near a computer; I thought I remembered it being all integers from -128 to 127 that are handled this way (or some other "round number"). For reference, the CPython C API documentation describes the cached range as -5 through 256.
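If you want to run the experiment yourself, here is a rough probe. The helper `is_cached` is made up for this sketch, and the result is an implementation detail of whichever interpreter you run it on, so no output is promised here:

```python
def is_cached(n):
    """Return True if two independently built ints equal to n are one object."""
    # Parsing from text avoids the compiler merging equal literals.
    return int(str(n)) is int(str(n))

# Scan a window around zero and collect the values that are shared.
cached = [n for n in range(-10, 300) if is_cached(n)]

if cached:
    print("shared objects from", cached[0], "to", cached[-1])
else:
    print("no shared small integers found in this range")
```

On an interpreter that keeps such a cache, the printed bounds show where the behaviour from the question stops.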

Nir Friedman
0

Your understanding is generally correct, but it's worth noting that python lists are totally different animals compared to arrays in C or C++. From the documentation:

id(obj) Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.

The simple answer to your question is that variables in Python are references to objects. Each time `[2,3]` is evaluated, a new list object is created, so `a` and `b` refer to two different objects and `id` reports two different values. `c=b` creates no new object; it only copies the reference, so `b` and `c` have the same `id`.
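A short sketch of what `id` reports for the lists from the question, plus the uniqueness guarantee quoted from the documentation (the list `live` exists only to keep many objects alive at once):

```python
a = [2, 3]
b = [2, 3]
c = b

print(id(a) == id(b))  # False: two separate list objects
print(id(b) == id(c))  # True: c is just another reference to b's list

# Objects that are alive at the same time never share an id.
live = [[] for _ in range(1000)]
print(len({id(obj) for obj in live}))  # 1000
```

The guarantee only covers overlapping lifetimes, so an id seen for an object that has since been discarded may show up again on a later, unrelated object.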

Untitled123