
I was checking how the global keyword works for a project when I ran CODE1 by mistake, and it behaved in a way I did not expect. The global keyword takes effect in the function even if it is not in an executed part (i.e. inside an if whose condition is not true).

I have been looking around for questions about the global keyword in Python, but I could not find an answer to this. I saw: Python global keyword behavior, global keyword in python, Using global variables in a function

The most interesting one, which I think may have something to do with it (but I am not sure), is this: Python global keyword

Below are the three minimal reproducible examples I used:

CODE1 (with global keyword):

a = 0
def my_function():
    b=2
    if b==0:
        print("first if")
        global a
    if b==2:
        print("second if")
        a = 2
        print("func -> a", a)
        print("func -> b", b)

if __name__ == '__main__':
    my_function()
    print("main -> a", a)

Result:

second if
func -> a 2
func -> b 2
main -> a 2

CODE2 (without global keyword):

a = 0
def my_function():
    b=2
    if b==0:
        print("first if")
    if b==2:
        print("second if")
        a = 2
        print("func -> a", a)
        print("func -> b", b)

if __name__ == '__main__':
    my_function()
    print("main -> a", a)

Result:

second if
func -> a 2
func -> b 2
main -> a 0

CODE3 (with global keyword, but inverted if statements):

a = 0
def my_function():
    b=2
    if b==2:
        print("second if")
        a = 2
        print("func -> a", a)
        print("func -> b", b)
    if b==0:
        print("first if")
        global a

if __name__ == '__main__':
    my_function()
    print("main -> a", a)

Result:

  File "global_var_test.py", line 18
    global a
    ^
SyntaxError: name 'a' is used prior to global declaration

As can be seen, if b==0: is always False and if b==2: is always True (the print output confirms it). I expected CODE1 to give the same result as CODE2, since global a would not be executed in the first example, so it would be the same as omitting it. But it gives an unexpected result: the global keyword takes effect anyway and the global variable a is changed to 2. After this, I tested CODE3, thinking the global keyword would be visible throughout the function regardless of its position, so CODE3 should give the same result as CODE1. Again I was wrong: it behaved as if global a were going to be executed (and since it comes after the assignment, a SyntaxError is raised).

So my final question is: does the global keyword (and maybe others like nonlocal, etc.) take effect in the order in which the code is written, independently of what is actually executed?

Please help me in clarifying this.

Ganathor
  • The entire function is parsed, and the storage type (global/local/nonlocal) determined for all variables, before any code is generated. (It *has* to be this way, because the storage type changes how variable accesses are performed.) So yes, a `global` declaration, or the fact that a variable is assigned within a function, can have an effect earlier in the function. – jasonharper Oct 28 '19 at 14:26
  • Another example of this is when you try to read a global variable but later in the same function assign to it without declaring it as global. – Daniel Roseman Oct 28 '19 at 14:32
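
A minimal sketch of the case described in the last comment, in case it helps to see it run. The names `counter`, `read_then_assign` and `outcome` are made up for illustration. Because `counter` is assigned somewhere in the function body, it is treated as local for the whole function, so the earlier read fails instead of falling back to the module-level value:

```python
counter = 0  # module-level (global) variable


def read_then_assign():
    # The assignment two lines below makes `counter` local to the whole
    # function, so this read fails even though a global `counter` exists.
    print(counter)
    counter = 1


try:
    read_then_assign()
    outcome = "no error"
except UnboundLocalError as exc:
    outcome = type(exc).__name__

print(outcome)  # UnboundLocalError
```

The global `counter` is left untouched, since the function never gets as far as the assignment.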

1 Answer


My answer on this question may help understand some of the technical details here, although this is a subtly different question.

In short, as you discovered, the Python compiler determines the scope of each variable when it compiles the function, before any of it runs, and it does so regardless of details like control statements. That is why a global statement in a branch that never executes still applies to the whole function. If the compiler encounters an assignment like a = 2 textually before the global statement, it rejects the function with a SyntaxError, as in your CODE3. If you try inverting the code (you didn't give an example quite like this) such that the compiler sees the global statement first, it will work (albeit still be bad code):

a = 0
def my_function():
    b=2
    if b==2:
        print("second if")
        global a
        print("func -> a", a)
        print("func -> b", b)
    if b==0:
        print("first if")
        a = 2

So for both practical/technical and stylistic reasons, you should always declare global (or nonlocal) variables at the beginning of a function and not anywhere else.
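
As a rough illustration of that recommendation, here is CODE1 rewritten with the declaration at the top, plus a nonlocal counterpart. The names `make_counter`, `increment` and `count` are hypothetical and only there to show the nonlocal case:

```python
a = 0


def my_function():
    global a  # declared once, before any use of `a`
    b = 2
    if b == 2:
        a = 2  # rebinds the module-level `a`


def make_counter():
    count = 0

    def increment():
        nonlocal count  # same idea: declare before the first use
        count += 1
        return count

    return increment


my_function()
print("main -> a", a)  # main -> a 2

counter = make_counter()
counter()
print("count ->", counter())  # count -> 2
```

With the declaration at the top, the behaviour no longer appears to depend on which branch runs, which is the confusion the question started from.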

I'm not sure if this is a language requirement or a detail of CPython; this would be an interesting follow-up question.

Update: Yes, this is a language specification requirement; see https://docs.python.org/3/reference/simple_stmts.html#grammar-token-global-stmt

Names listed in a global statement must not be used in the same code block textually preceding that global statement.

Here textually preceding just means in terms of the text of the code, regardless of surrounding details such as control statements. This is because global is actually a directive to the parser, which determines whether a variable has local or global binding based on how it first sees that variable used. In terms of implementation details that still might not be exactly accurate; e.g. CPython builds a symbol table for a module in a separate pass over the AST returned from the parser. Thus the textual order of the code also determines the order in which nodes in the AST are traversed. E.g. you can see where your error message came from in the visitor for global statements.
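
If you want to poke at this yourself, the standard library's symtable module exposes the symbol table without running any code. The following is only a sketch, assuming Python 3.6 or later (where the CODE3 case is an error rather than a warning); `CODE1_SRC` and `CODE3_SRC` are trimmed-down copies of the question's examples, and the other variable names are made up:

```python
import symtable

CODE1_SRC = '''
a = 0
def my_function():
    b = 2
    if b == 0:
        global a
    if b == 2:
        a = 2
'''

CODE3_SRC = '''
a = 0
def my_function():
    b = 2
    if b == 2:
        a = 2
    if b == 0:
        global a
'''

# Build the symbol table for CODE1 without executing it.
module_table = symtable.symtable(CODE1_SRC, "<code1>", "exec")
func_table = next(
    t for t in module_table.get_children() if t.get_name() == "my_function"
)

a_is_global = func_table.lookup("a").is_declared_global()
b_is_global = func_table.lookup("b").is_global()
print("a declared global:", a_is_global)  # True, although the branch never runs
print("b global:", b_is_global)           # False, b is an ordinary local

# CODE3 is rejected at compile time, before my_function could be called.
try:
    compile(CODE3_SRC, "<code3>", "exec")
    code3_rejected = False
except SyntaxError:
    code3_rejected = True
print("CODE3 rejected:", code3_rejected)
```

Nothing in either source string is executed here, so the results depend only on the text of the function, not on which if branch would be taken.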

Iguananaut