7

Python variables are for the most part really easy to understand, but there is one case I have been struggling with. If I want to point my variable to a new memory address, how do I do this? Or, if Python does this by default (treating variables like pointers), then how do I literally assign the value from a new variable to the memory address of the old variable?

For example, if I type

a=1
b=2
c=a
c=b

What is the value of c? And what does it point to? Is the statement replacing the pointer c -> a with pointer c -> b or grabbing the value from b and overwriting a with b's value? c=b is ambiguous.

In other words, if you start with this:

a -> 1 <- c
b -> 2

is it re-pointing c like this:

a -> 1    _c
b -> 2 <-/

or copying b like this?

a -> 2 <- c
b -> 2
Ryan
  • 6
    In Python you can reassign variables as you wish. c now refers to the same object that b refers to. Try this `print(id(c) == id(b))` and `print(id(c) == id(a))`. Any similarity between c and a is now erased. See more about names here: https://docs.python.org/3/reference/executionmodel.html. I'm sure however there are more elaborate answers here on SO on the topic. – Anton vBR Oct 07 '18 at 06:20
  • 3
    Ok, after seeing your update: names point at objects. They don't hold any values. No copying is involved. – Anton vBR Oct 07 '18 at 06:28
  • 5
    Read the following article by StackOverflow legend, Ned Batchelder, that explains this exhaustively: [Facts and myths about Python names and values](https://nedbatchelder.com/text/names.html) – juanpa.arrivillaga Oct 07 '18 at 06:28
  • @juanpa.arrivillaga good article. Even has a code runtime visualizer. – Ryan Oct 07 '18 at 06:38
  • @juanpa.arrivillaga Exactly the reference we needed. Anyone who has interest in understanding this question should be linked to that page. – Anton vBR Oct 07 '18 at 06:44

8 Answers

15

There are no pointers to variables in Python. In particular, when you say this:

Is the statement replacing the pointer c -> a with pointer c -> b...

Python does not have any such thing as "the pointer c -> a", so it is not doing that.

...or grabbing the value from b and overwriting a with b's value

but there is no assignment to a, so it's not doing that either.

Instead, Python keeps a symbol table[1] that maps each name (a, b, c, etc.) to a pointer to an object. In your code sample, after you assign to a and b, it would look like this (obviously I have made up the memory addresses):

a -> 0xfffa9600 -> 1
b -> 0xfffa9608 -> 2

and then after you assign c = a, it would look like this:

a -> 0xfffa9600 -> 1
b -> 0xfffa9608 -> 2
c -> 0xfffa9600 -> 1

Note that c is entirely independent of a. When you run c = b, it replaces the pointer associated with c in the symbol table with the pointer that was associated with b, but a is not affected:

a -> 0xfffa9600 -> 1
b -> 0xfffa9608 -> 2
c -> 0xfffa9608 -> 2
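
The three snapshots above can be checked with the `is` operator, which tests whether two names refer to the same object. This is only a minimal sketch; the helper names (`same_as_a_before` and so on) are made up for illustration, and it does not depend on the invented addresses.

```python
a = 1
b = 2

c = a
# c and a are now bound to the same object
same_as_a_before = c is a

c = b
# Rebinding c changes only c's entry; a and b are untouched
same_as_a_after = c is a
same_as_b_after = c is b

print(same_as_a_before, same_as_a_after, same_as_b_after)  # True False True
print(a, b, c)                                             # 1 2 2
```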

In this case that's pretty much all there is to it because the objects in question, namely the integer constants 1 and 2, are immutable. However, if you use mutable objects, variables do start to act a bit more like pointers, in the sense that changes made to an object through one variable are visible through every other variable that refers to the same object. For example, consider this sample of code:

x = {'a': 1, 'b': 2}
y = x

Here, the symbol table might look something like this:

x -> 0xffdc1040 -> {'a': 1, 'b': 2}
y -> 0xffdc1040 -> {'a': 1, 'b': 2}

If you now run

y['b'] = y['a']

then it doesn't actually change the pointer associated with y in the symbol table, but it does change the object pointed to by that pointer, so you wind up with

x -> 0xffdc1040 -> {'a': 1, 'b': 1}
y -> 0xffdc1040 -> {'a': 1, 'b': 1}

and you'll see that your assignment to y['b'] has affected x as well. Contrast this with

y = {'a': 1, 'b': 2}

which actually makes y point at an entirely different object, and is more akin to what you were doing before with a, b, and c.
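
To make the contrast concrete, here is a small sketch that runs both cases back to back. The last step is an assumption about what "overwriting the old object" could look like for a mutable object: it changes the contents in place with `clear()` and `update()` instead of rebinding a name. The name `alias` is made up for the example.

```python
x = {'a': 1, 'b': 2}
y = x                  # y and x name the same dict

y['b'] = y['a']        # mutates the shared object in place
print(x)               # {'a': 1, 'b': 1}
print(x is y)          # True

y = {'a': 1, 'b': 2}   # rebinds y to a brand-new dict
y['a'] = 99            # only the new dict changes
print(x)               # {'a': 1, 'b': 1}
print(x is y)          # False

# Overwrite the contents of the old object without rebinding any name
alias = x              # a second name for the old dict
x.clear()
x.update(y)
print(alias)           # {'a': 99, 'b': 2}
print(alias is x, x is y)   # True False
```

Only mutable objects offer this option. An integer has no in-place update, so for a, b, and c rebinding is the only thing assignment can do.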


[1] Actually there are several symbol tables, corresponding to different scopes, and Python has an order in which it checks them, but that detail isn't particularly relevant here.
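
As a rough illustration of the footnote, the sketch below uses nested functions, each with its own table of names. The function and variable names are hypothetical. An assignment inside a function adds an entry to that function's own table, while a name that is only read is looked up in the enclosing scope when it is not found locally.

```python
def outer():
    value = 'enclosing'      # entry in outer's table

    def inner_shadow():
        value = 'local'      # new entry in inner_shadow's own table
        return value

    def inner_read():
        return value         # not found locally, so the enclosing scope is used

    # outer's own binding is unaffected by the assignment in inner_shadow
    return inner_shadow(), inner_read(), value

result = outer()
print(result)   # ('local', 'enclosing', 'enclosing')
```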

David Z
  • Nice! Are these memory addresses equal to the ids of the objects? – Anton vBR Oct 07 '18 at 06:47
  • Very in depth. Thank you. As someone else pointed out I was not clear in my explanation. When I wrote `c -> a`, in my mind I was resolving `a` to a physical location in memory storing the value 1. It seems that Python has an added layer of resolution. My background in C++ causes confusion for me in Python. – Ryan Oct 07 '18 at 06:48
  • 2
    @AntonvBR If by "ids" you mean the values returned from [the `id()` builtin function](https://docs.python.org/3/library/functions.html#id): they might be. In the standard Python interpreter (CPython), at least currently, `id()` does give the memory address. Other Python interpreters (or future versions of CPython) might do it differently. – David Z Oct 07 '18 at 06:49
  • @Ryan You're welcome, glad I could help. This is a common source of confusion for people coming to Python from lower level languages so you'll find a lot written about it on Stack Overflow and elsewhere. – David Z Oct 07 '18 at 06:52
7

c doesn't "point at a or b"; it points at the objects 1 or 2.

>>> a = 1
>>> b = 2
>>> c = a
>>> c
1
>>> c = b
>>> c
2
>>> b = 3
>>> c
2

This can be proven somewhat with id() - b and c point at the same "thing":

>>> b = 2
>>> c = b
>>> id(b)
42766656
>>> id(c)
42766656
Attie
  • 1
    Your answer is a good balance between in depth and easy to understand. It is critical for C++ developers like me to realize that the value is an object separate from the label `c`. In Python, `c` IS a memory location that contains a pointer to another memory location that holds `1`, but in C++, `c` IS a memory location that contains the data `1`. – Ryan Oct 07 '18 at 07:16
2

To answer both of your questions at once (What is the value of c? What does c point to?), I've added a step-by-step execution showing the id() of each variable, with comments. I hope this helps you understand what is happening under the hood.

>>> a = 1
>>> b = 2
>>> print(id(a))
1574071312    # id of the object a refers to (its address in CPython)
>>> print(id(b))
1574071344    # id of the object b refers to
>>> c = a     # bind c to the object a refers to
>>> print(c)
1             # c now refers to the same value as a
>>> print(id(c))
1574071312    # same id as a: both names refer to one object
>>> c = b     # rebind c to the object b refers to
>>> print(c)
2             # c now refers to the same value as b
>>> print(id(c))
1574071344    # same id as b now
A l w a y s S u n n y
1

Well my friend, in this example it may look as if c is pointing to a, but it is not; they just happen to point at the same value. For example, if you write

a = 2
c = a

then after these assignments, if you rebind a with a = 3, c does not change: it still has the value 2.

Imagine the values [2] and [3] sitting in boxes, with the variables a, b, and c just pointing to those boxes.

After c = a, the variable c points directly to the box [2], the same box a points to. It does not point to the variable a, so c does not follow a when a is later moved to the box [3]. Hope this explanation helps.

Ryan
1

So to summarize a few of the really good answers I saw from others,

  1. Values are objects at a memory location without a name.
  2. Variables (variable names/labels) have no intrinsic value. They are separate objects with their own space in memory, and they can point to any value objects.
  3. The Assignment operator points a label object to a value object.

Let's go step by step, loosely speaking, through the assignment operation from the point of view of the Python interpreter:

  1. First, we create a value.

    [value obj]
    

    Note: [ ] denotes a physical memory location. This means the value has its own unique memory address.

  2. Next, we create a label.

    [Label obj] -> nothing
    
  3. Last, we assign the label to its value.

    [Label obj] -> [value obj]     
    

So,

a = 1

is the same as

[memorylocation containing "a"] -> [memorylocation containing 1]

and

c = b

is the same as

[memorylocation containing "c"]  ->  "b" resolved to [memorylocation containing 2]
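
One loose way to picture this model is a dict that maps label strings to objects. The sketch below uses a plain dict called `labels` (a hypothetical name) as a stand-in for a namespace. It is an analogy for the steps above, not a description of the interpreter's internals.

```python
labels = {}                 # hypothetical stand-in for a namespace

labels['a'] = 1             # a = 1
labels['b'] = 2             # b = 2
labels['c'] = labels['a']   # c = a : resolve "a", bind "c" to that object
labels['c'] = labels['b']   # c = b : resolve "b", rebind "c"

print(labels)                        # {'a': 1, 'b': 2, 'c': 2}
print(labels['c'] is labels['b'])    # True
```

Note that the right-hand side is resolved to an object first, so the entry for "c" never stores a reference to the label "b" itself.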
Ryan
0

Basically, in the fourth line the variable c is overwritten with the value of b. As this is the last statement, c will hold the value 2.

  • How do I explicitly force b's value to overwrite/copy to a's value? – Ryan Oct 07 '18 at 06:26
  • 2
    @Ryan in Python, you do not. Variables act like name-tags to objects, they are not memory addresses. Python doesn't have pointers. – juanpa.arrivillaga Oct 07 '18 at 06:31
  • This explanation is ambiguous, because it relies on me already understanding that the "c variable" is its own memory location pointing to another memory location. – Ryan Oct 07 '18 at 07:05
0

Well, in your code:

a=1
b=2
c=a
c=b

Before you assign c to b's value and after you assign c to a's value, c will be a.

And after, at the end of the code, c will be b, because you're reassigning the variable.

The second assignment basically creates a new variable without knowing that the variable already exists, so it just does it, but afterwards there is no way to access the value the variable previously held.

Ryan
U13-Forward
  • Thanks. I like your answer, because it shows how different Python's execution model is from C++. In your explanation, when you say "assign `c` to `b`", a C++ developer would look at this and say it is backwards. They would "assign `b` to `c`", because the assignment operator is a very low-level operation with an implied flow/direction for the data (retrieve value of `b`, malloc to memory heap, and dump the CPU's register into the memory location at `c`). – Ryan Oct 07 '18 at 06:56
  • From now on, this is how I will be thinking about it. – Ryan Oct 07 '18 at 07:21
  • 5
    I actually think this answer is a little misleading, since it says things like "`c` will be `a`", but it's really not true that `c` is `a`; instead, `c` "is" the same value that `a` also happens to be. There's no relation between `c` and `a`, except that (at a particular point in the program's execution) they happen to both refer to the same value. @Ryan, if this explanation really helps you understand how Python actually works, that's fine, but I think it has the potential to confuse other people in the same way you were confused when you asked the question. – David Z Oct 07 '18 at 07:54
  • 1
    @Ryan in C terms, I suppose, you could say that in Python, you can never really refer to a value of an object directly, just indirectly through a pointer which you cannot dereference. You can merely swap pointers around on a symbol table. – juanpa.arrivillaga Oct 07 '18 at 08:25
  • @DavidZ This wording is specifically perfect for helping a low level language programmer understand Python's confusing execution model. The inaccuracy you see in his wording is actually helpful for tipping the C programmer brain off that this is not the same. It kicks the C programmer brain out of C mode while still wording it in a way that makes sense to a C programmer brain, because the C programmer brain will automatically resolve `a` and `b` to their actual memory locations. It is only misleading to a non-C programmer haha. – Ryan Oct 07 '18 at 08:38
  • 1
    @Ryan I disagree. To be clear, I understand the way you see it as you explained in your comment, and _some_ low-level programmers will also understand the answer that way, and for them it will be helpful, but I think other people (specifically other low-level programmers) will misinterpret it, and for those people this way of explaining it will be harmful. I'm skeptical that this is only misleading to non-C programmers, as you claim. – David Z Oct 07 '18 at 08:47
0

What you've encountered is reference duplication in Python. To quote from the copy module documentation:

Assignment statements in Python do not copy objects, they create bindings between a target and an object

You can observe how that works in practice if you think in terms of objects and their values, and use the is operator and the id() built-in function:

>>> a=1
>>> b=2
>>> c=a
>>> a is c
True
>>> id(a), id(c)
(10932288, 10932288)

You can also verify the same thing via reference counts:

>>> import sys
>>> a=1
>>> b=2
>>> sys.getrefcount(a)
803
>>> sys.getrefcount(b)
97
>>> c=a
>>> sys.getrefcount(c)
804
>>> sys.getrefcount(a)
804
>>> c=b
>>> sys.getrefcount(a)
803
>>> sys.getrefcount(b)
98
>>> 
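
The absolute counts for small integers such as 1 and 2 are large and vary between interpreter versions, because the interpreter itself shares those objects. A sketch that is easier to reproduce, assuming CPython, is to use a fresh object and look only at how the count changes. The names `obj` and `alias` are made up for the example.

```python
import sys

obj = object()                       # fresh object with no other references
before = sys.getrefcount(obj)

alias = obj                          # bind a second name to the same object
after_bind = sys.getrefcount(obj)

alias = None                         # rebind the second name elsewhere
after_rebind = sys.getrefcount(obj)

# Differences relative to the starting count
print(after_bind - before, after_rebind - before)
```

On CPython, binding the extra name should raise the count by one and rebinding it should bring the count back down, which matches the 803 -> 804 -> 803 pattern in the transcript above.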

Tangentially, this is related to deep and shallow copying. Again from copy documentation:

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):

  • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
  • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
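
The two definitions can be seen side by side with a nested list. This is a minimal sketch using `copy.copy` and `copy.deepcopy`; the variable names are made up for the example.

```python
import copy

original = [1, 2, [9, 8, 7]]
shallow = copy.copy(original)        # new outer list, same inner objects
deep = copy.deepcopy(original)       # new outer list, new inner list

print(shallow is original, deep is original)   # False False
print(shallow[-1] is original[-1])             # True
print(deep[-1] is original[-1])                # False

original[-1].append(6)               # mutate the nested list
print(shallow[-1])                   # [9, 8, 7, 6]
print(deep[-1])                      # [9, 8, 7]
```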

Your example uses simple immutable values (integers), and they will always default to reference duplication: no new object is created even if you try to deep copy them:

>>> import copy
>>> id(b),id(c)
(10932320, 10932320)
>>> c = copy.deepcopy(b)
>>> id(b),id(c)
(10932320, 10932320)

However, with mutable objects such as lists, the story is different:

>>> a = [1,2,3]
>>> b = [3,2,1]
>>> c = a
>>> id(a),id(c)
(139967175260872, 139967175260872)
>>> c = copy.deepcopy(a)
>>> id(a),id(c)
(139967175260872, 139967175315656)

In the above example, you get an entirely different object. Why might this be useful? The fact that simple assignment just makes two variables reference the same object also implies that if you change the object through one, the change is reflected in the other.

>>> c = a
>>> id(c),id(a)
(139967175260872, 139967175260872)
>>> a.append(25)
>>> id(c),id(a)
(139967175260872, 139967175260872)
>>> c
[1, 2, 3, 25]
>>> 

This can be impractical when you want to keep the original data. When you want two objects that start out equal but can then change independently, that's where you want either a shallow copy (which copies just the object itself) or a deep copy (which also copies all the objects contained within it):

>>> c = copy.deepcopy(a)
>>> a.append(35)
>>> a
[1, 2, 3, 25, 35]
>>> c
[1, 2, 3, 25]

And just for demo purposes, shallow copy:

>>> c = a
>>> a.append([9,8,7])
>>> a
[1, 2, 3, 25, 35, [9, 8, 7]]
>>> c = a
>>> id(a), id(c), id(a[-1])
(139967175260872, 139967175260872, 139967175315656)
>>> c = copy.copy(a)
>>> id(a), id(c), id(a[-1])
(139967175260872, 139967175315528, 139967175315656)

See also grc's excellent answer on the same topic with better examples.

Community
Sergiy Kolodyazhnyy