
I am trying to clarify for myself Python's rules for 'assigning' values to variables.

Is the following comparison between Python and C++ valid?

  1. In C/C++, the statement int a=7 means that memory is allocated for an integer variable called a (the quantity on the LEFT of the = sign), and only then is the value 7 stored in it.

  2. In Python, the statement a=7 means that a nameless integer object with value 7 (the quantity on the RIGHT side of the =) is created first and stored somewhere in memory. Then the name a is bound to this object.

The output of the following C++ and Python programs seems to bear this out, but I would like some feedback on whether I am right.

C++ produces different memory locations for a and b, while a and b seem to refer to the same location in Python (going by the output of the id() function).

C++ code

#include<iostream>
using namespace std;
int main(void)
{
  int a = 7;
  int b = a; 
  cout << &a <<  "  " << &b << endl; // a and b point to different locations in memory
  return 0;
}

Output: 0x7ffff843ecb8 0x7ffff843ecbc

Python code

a = 7
b = a
print(id(a), ' ', id(b))  # a and b seem to refer to the same location

Output: 23093448 23093448

smilingbuddha

2 Answers


Yes, you're basically correct. In Python, a variable name can be thought of as a binding to a value. This is one of those "a ha" moments people tend to experience when they truly start to grok (deeply understand) Python.

Assigning to a variable name in Python rebinds the name to a different value from the one it was bound to (if it was bound at all), rather than changing the value it was bound to:

a = 7   # Create 7, bind a to it.
        #     a -> 7

b = a   # Bind b to the thing a is currently bound to.
        #     a
        #      \
        #       *-> 7
        #      /
        #     b

a = 42  # Create 42, bind a to it, b still bound to 7.
        #     a -> 42
        #     b -> 7

I say "create" but that's not necessarily so - if a value already exists somewhere, it may be re-used.
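A minimal sketch of that re-use, assuming CPython (the reference interpreter), which keeps a cache of small integers. This is an implementation detail rather than a language guarantee, so code should not rely on equal immutable values being the same object:

```python
# Assumption: CPython, which caches and shares small integers.
a = 7
b = 7
print(a is b)   # True on CPython: both names are bound to the cached 7

# Integers built at run time outside the cached range are
# typically separate objects, even when their values are equal.
x = int("10000000000")
y = int("10000000000")
print(x == y)   # True: the values are equal
print(x is y)   # typically False on CPython: two distinct objects
```

Use == to compare values; is only tells you whether two names are bound to the very same object.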

Where the underlying data is immutable (cannot be changed), that usually makes Python look as if it's behaving identically to the way other languages do (C and C++ come to mind). That's because the 7 (the actual object that the names are bound to) cannot be changed.

But, for mutable data (same as using pointers in C or references in C++), people can sometimes be surprised because they don't realise that the value behind it is shared:

>>> a = [1,2,3]     # a -> [1,2,3]
>>> print(a)
[1, 2, 3]

>>> b = a           # a,b -> [1,2,3]
>>> print(b)
[1, 2, 3]

>>> a[1] = 42       # a,b -> [1,42,3]
>>> print(a) ; print(b)
[1, 42, 3]
[1, 42, 3]

You need to understand that a[1] = 42 is different to a = [1, 42, 3]. The latter is an assignment, which would result in a being re-bound to a different object, and therefore independent of b.

The former is simply changing the mutable data that both a and b are bound to, which is why it affects both.
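The difference between the two can be checked with the is operator, which tests whether two names are bound to the same object. A small sketch:

```python
a = [1, 2, 3]
b = a            # b is bound to the same list object as a

a[1] = 42        # mutation: changes the one list both names share
print(b)         # [1, 42, 3]
print(a is b)    # True

a = [1, 42, 3]   # assignment: a is re-bound to a brand-new list
a[0] = 99        # only the new list changes
print(b)         # [1, 42, 3]
print(a is b)    # False
```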

There are ways to get independent copies of a mutable value, with things such as:

b = a[:]
b = [item for item in a]
b = list(a)

These will work to one level (b = a can be thought of as working to zero levels), meaning that if the list a contains other mutable things, those will still be shared between a and b:

>>> a = [1, [2, 3, 4], 5]
>>> b = a[:]
>>> a[0] = 8             # This is independent.
>>> a[1][1] = 9          # This is still shared.
>>> print(a) ; print(b)  # Shared bit will 'leak' between a and b.
[8, [2, 9, 4], 5]
[1, [2, 9, 4], 5]

For a truly independent copy, you can use deepcopy, which will work down to as many levels as needed to separate the two objects.
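As a sketch, here is the same nested list as in the previous example, this time copied with copy.deepcopy from the standard library:

```python
import copy

a = [1, [2, 3, 4], 5]
b = copy.deepcopy(a)   # copies the outer list and the nested list

a[0] = 8               # independent, as before
a[1][1] = 9            # now independent too: b has its own inner list
print(a)               # [8, [2, 9, 4], 5]
print(b)               # [1, [2, 3, 4], 5]
```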

paxdiablo
  • I have mixed feelings about explaining Python names in terms of pointers. Sure, it can be helpful to people who already understand pointers, but it also encourages them to translate Python concepts into their old-school way of thinking rather than embracing the Python Way. So I prefer to say that Python assignment is ultimately a dictionary operation: the object gets placed into a dictionary with the name as the key. And from the language used in the question it appears that smilingbuddha already gets this. :) – PM 2Ring Jan 30 '15 at 03:26
  • @PM2Ring, changed it to use "refers to", hopefully that will make it less likely to be locked to a specific language concept. – paxdiablo Jan 30 '15 at 03:52
  • No worries, pax. But hey, it's your call, I make no claims to be a perfect Python teacher. :) Answers here are supposed to be timeless, but I find it's generally easier to explain this sort of stuff in an interactive way, adjusting the explanation until I find something that clicks with the reader / listener. – PM 2Ring Jan 30 '15 at 03:57

In your example code, int is a built-in type in C++, so the "=" operator cannot be overloaded for it and always copies the value. In Python, however, "=" doesn't always create a new object; two names can also refer to the same object. The Python object model is somewhat like Java's: most variables hold a reference to an object rather than a copy.

You can also try this:

a = 7
b = 7
print(id(a), ' ', id(b))

This outputs the same id twice, because Python binds both a and b to the same constant object.

Cui Heng
  • This is an old conversation, but I'd like to point out that, for IMMUTABLE types, this answer is valid only for small ints and short strings. True, the ids in your example are equal, but not for longer strings and bigger ints. Compare the ids of a and b when you assign to them: 10**10, 'Lorem ipsum dolor si amat' or (1,) – fanny Sep 24 '22 at 21:08