
I have read a number of posts over the years that explained Python variables by "Python has names, other languages have variables" or by "objects and tags and boxes".

I have always disliked these kinds of explanations, since they require new mental images and, in my opinion, try to redefine common terminology. (Arguably Python has variables too, and other languages have names too.)

Since I only ever did proper programming with Python, its behavior was the only one I knew, so these discussions didn't really affect me much.

Nonetheless, I would like to know whether my following understanding of this topic is correct, before I defend it in discussions with others:

All Python variables are pointers

While writing this question, I noticed that a similar question has been asked before, and everybody in that thread, including the OP, concluded that Python variables are not pointers (the answers used the explanations above instead), since

i = 5
j = i
j = 3
print(i)

results in 5. However, to me this result makes perfect sense. In my understanding it is translated to

void *i;
void *j;

int temp1 = 5;  // RHS of line 1 above
i = &temp1;     // LHS of line 1 above

j = i;          // Line 2 above: copy the pointer, so j and i point to the same object

int temp2 = 3;  // RHS of line 3 above
j = &temp2;     // LHS of line 3 above

(To be precise, there is one answer to the previously cited question that does provide more or less this explanation, but has a negative score and no comments as to what is wrong with it.)
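
The pointer reading can also be checked from within Python. As a minimal sketch (not taken from either thread), the `is` operator compares object identity, so it shows whether two names currently refer to the same object:

```python
i = 5
j = i                      # copy the reference, not the value
shared_before = j is i     # both names refer to the same int object

j = 3                      # rebind j only; i is untouched
shared_after = j is i

print(shared_before, shared_after, i)  # True False 5
```

Rebinding `j` changes where `j` points and nothing else, which matches the `j = &temp2` step in the translation above.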

As far as I understand, this model also perfectly explains parameters in Python, e.g., it explains why

def foo(b):
    b = 2
a = 1
foo(a)
print(a)

results in 1. Within foo, the local pointer variable b gets redirected to the address of a new integer object. The outer pointer variable a still points to the integer object with value 1.

To get a complete picture, and make sense of property access and why it can be used to change a variable that is passed to a function after all, I would then add the claim:

Property access with . in Python translates to -> in C++

Since indexing is just syntactic sugar, it is also covered by this rule.
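
To make the two rules concrete, here is a small sketch contrasting rebinding a parameter with mutating the object it refers to. The class `Box` and the function names are hypothetical, introduced only for illustration:

```python
class Box:
    def __init__(self, value):
        self.value = value

def rebind(b):
    b = Box(2)          # only the local name b is redirected

def mutate(b):
    b.value = 2         # follows the reference and changes the shared object

def mutate_item(items):
    items[0] = 2        # sugar for items.__setitem__(0, 2)

a = Box(1)
rebind(a)
after_rebind = a.value  # the caller's object is unchanged
mutate(a)
after_mutate = a.value  # the caller sees the change

nums = [1]
mutate_item(nums)
print(after_rebind, after_mutate, nums)  # 1 2 [2]
```

In the pointer model, `rebind` corresponds to assigning a new address to a local pointer, while `mutate` and `mutate_item` correspond to writing through the pointer.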

In summary, I believe two simple rules explain everything about variables in Python. This is a bold statement, and I am happy to learn if and how it is wrong.

EDIT: Another very popular question whose answers argue against the pointer analogy

Bananach
  • Your understanding is perfect. – rawwar May 09 '18 at 12:47
  • Python has names because it's actually variable names that are looked up in a dict at runtime, not a variable id assigned at compilation (which does not happen). While it looks the same in use, it is not merely a change of terminology: the implementation is different. – Olivier Melançon May 09 '18 at 12:50
  • But it is true that the explanation often lacks an example. Something that IS different with names is that in Python you can change where a variable is pointing to in memory by updating the globals or locals dicts – Olivier Melançon May 09 '18 at 12:56
  • @Olivier Good points. Still, since they are rather technical points concerning an additional indirection layer, I believe attracting people coming from C and C++ would be much easier if the simple pointer explanation was used – Bananach May 09 '18 at 13:00
  • I do not agree, some consequences are not only technical and actually pretty useful, for example the fact that you can catch a `NameError` exception in Python. This is something that would make no sense in C. – Olivier Melançon May 09 '18 at 13:05
  • "simple pointer explanation" :-) Your understanding is ok, although I'm not sure how helpful it will be for non-C users, if you disallow null pointer creation and mention that while `.` looks like it functions like `->` the implementation is very different, especially for methods. – thebjorn May 09 '18 at 13:05

2 Answers


The correct explanation really is that: Python has names, other languages have variables. What this means is that Python's variables are recovered by looking up their name in a dict at runtime.

Although, I'll give you that in my early Python days I too found this explanation disappointing. What I would have liked at the time would have been examples of what this implies for the developer.

Variables are in a dict

In lower-level languages, a variable's name is thrown away and replaced by an address in memory by the compiler. This is not the case in Python, which allows you to access all variables with the vars builtin that returns the dict of the scope's variables.

This means you can read variables from that dict.

foo = 1
vars()['foo'] # 1

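And that you can update and even declare a variable using this dict. (This is reliable at module level; inside a function, writes to the dict returned by vars() are not guaranteed to affect the local variables.)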

vars()['bar'] = 1
bar # 1

It also means you can list all variables in scope.

foo = 1
bar = 1
vars().keys() # dict_keys(['__name__', '__doc__', ... , 'foo', 'bar'])
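
The same idea can be sketched with an explicit dict instead of the one returned by vars(). Here `namespace` is a hypothetical stand-in for a module's global namespace, and `exec` runs each statement with that dict as its globals:

```python
namespace = {}                          # hypothetical stand-in for a module's globals
exec("foo = 1", namespace)              # assignment stores the key 'foo'
namespace["bar"] = 2                    # "declare" bar by writing to the dict
exec("total = foo + bar", namespace)    # name lookup reads from the dict

# exec also inserts '__builtins__', so filter out dunder keys
user_names = sorted(k for k in namespace if not k.startswith("__"))
print(user_names, namespace["total"])   # ['bar', 'foo', 'total'] 3
```

This is only an illustration of the lookup mechanism, not a pattern to use in ordinary code.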

The NameError exception

In C, accessing a variable that has not been declared is a compile error. In Python, since variables' names are looked up in a dict, it is a runtime exception that can be caught like any exception.

try:
    print(foo)
except NameError:
    print('foo does not exist') # This is printed

There are no pointers

In Python, the whole mechanism of accessing a value by its position in memory is hidden from you. Your way to access a variable is by knowing its name.

In particular this means that you can delete a name and prevent all access whatsoever to its value.

foo = object()
del foo
# The above object is now completely out of reach

What the above does is remove the 'foo' key in the vars() dict.
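
Note that the object only becomes unreachable if the deleted name was the last reference to it. A minimal sketch, assuming CPython's reference counting (other implementations may reclaim the object later); the class `Thing` and the variable names are hypothetical:

```python
import weakref

class Thing:
    pass

first = Thing()
second = first                  # a second name for the same object
probe = weakref.ref(first)      # observes the object without keeping it alive

del first                       # removes one name only
still_alive = probe() is second

del second                      # removes the last reference
collected = probe() is None     # holds on CPython, where refcount 0 frees at once

print(still_alive, collected)
```

A weak reference is used here because it lets the code observe the object's lifetime without adding a reference of its own.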

Olivier Melançon
  • I believe you, but I still think these can be considered technical details that are irrelevant to someone who is just learning how Python variables work, or even anyone that is not an expert already. If you want to access variables by local or global dict, you'd better know exactly what you are doing, otherwise you're probably writing terrible code. Also, I believe your last point is wrong, `del` doesn't take the object completely out of reach, it just reduces its reference count, i.e., it deletes **a pointer** to it. You can still have another local variable pointing to the object. – Bananach May 10 '18 at 06:14
  • In that case, what `del` does is remove the 'foo' entry in the `vars()` dict. Since it's the only reference, the object cannot be referenced, *even if you knew its id*. – Olivier Melançon May 10 '18 at 09:39
  • For the rest, it's up to you to interpret this as you like. But there is a real implementation difference and I showed the implication it brings. – Olivier Melançon May 10 '18 at 09:41

It's a bit (well, a lot) more complicated than that. However, simplistically, yes: in CPython every reference to a Python object is a pointer to a PyObject or PyVarObject, as defined in object.h:

/* Nothing is actually declared to be a PyObject, but every pointer to
 * a Python object can be cast to a PyObject*.  This is inheritance built
 * by hand.  Similarly every pointer to a variable-size Python object can,
 * in addition, be cast to PyVarObject*.
 */
typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size; /* Number of items in variable part */
} PyVarObject;
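
The `ob_refcnt` field can be observed from Python through `sys.getrefcount`. The following is a rough sketch that assumes a standard CPython build (the absolute numbers are an implementation detail, so only the differences are compared); the class `Node` is hypothetical:

```python
import sys

class Node:
    pass

obj = Node()
before = sys.getrefcount(obj)       # includes the temporary reference held by the call

alias = obj                         # one more pointer to the same PyObject
after_alias = sys.getrefcount(obj)

del alias                           # drop that pointer again
after_del = sys.getrefcount(obj)

print(after_alias - before, after_del - before)  # 1 0
```

Binding a new name increments the count and deleting the name decrements it, which is the behaviour the struct definition above exists to support.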
FHTMitchell
  • I understand that the implementation is more complicated. I want to know if my understanding is enough to correctly predict behavior of variables in Python – Bananach May 09 '18 at 13:06
  • @Bananach: the `ob_refcnt` field is important too. It is the part that allows garbage collection of objects that are no longer referenced. That means that a Python variable and a C `void *` pointer will share some behaviour: both can represent different things at different times, but scope and lifetime are too different for the comparison to really make sense. – Serge Ballesta May 09 '18 at 13:30
  • @Serge why is the scope different? And isn't the reference count mechanism the same as in what C++ calls smart **pointers**? – Bananach May 10 '18 at 06:18