
I am getting UnboundLocalError for one of the variables but not the other one.

This is the code that confuses me:

tokens=[]
token_count = 0

def extend():
    for word in line.split():
        tokens.append(word)
        token_count += 1

lines = ['this is one line', 'this is another line']
for line in lines:
    extend()
    print(tokens)

Running this you would get the error:

UnboundLocalError: local variable 'token_count' referenced before assignment

But removing the token_count += 1 line makes the code run fine. I understand that this can be fixed by adding a global token_count statement at the top of the function body.

What I have trouble understanding is why Python is not complaining about the other variables (e.g., line, used in for word in line.split(), and tokens). These variables are all defined outside the function, so if I get an UnboundLocalError for one of them, I would expect to get it for the others too.

Any explanation for this behavior?

CentAu

1 Answer


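Names used inside a function are looked up in an outer scope if they are not assigned anywhere within the function. However, if you do assign to a name anywhere within a function, then that name is local for the whole function body (unless you use a nonlocal or global statement as appropriate), and you cannot use its value before you have assigned it. This is decided when the function is compiled, not when the line runs. Where you are doing token_count += 1, you are trying to look up the existing value (in order to add 1 to it) before any value has been assigned within the function, and this will fail. By contrast, line is only read, and tokens.append(word) calls a method on the existing list without assigning to the name tokens, so both are still found in the outer scope.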
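To make the rule concrete, here is a minimal sketch that puts the three cases side by side. The names counter, items, and the four function names are made up for illustration and are not from the question.

```python
counter = 0   # module-level (global) name
items = []    # module-level list


def read_only():
    # No assignment to `counter` in this function, so the name is
    # looked up in the global scope.
    return counter + 1


def mutate_list():
    # Calling a method does not rebind the name `items`, so `items`
    # still refers to the global list.
    items.append("x")


def broken_increment():
    # `+=` counts as an assignment, so `counter` is local for the whole
    # function. Reading it before it has a local value raises
    # UnboundLocalError.
    counter += 1


def fixed_increment():
    global counter  # `counter` now refers to the module-level name
    counter += 1


read_only_result = read_only()
mutate_list()

try:
    broken_increment()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

fixed_increment()
print(read_only_result, items, error_name, counter)
```

Only broken_increment fails, because it is the only function that assigns to a name without declaring it global.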

It is not normally good practice to make use of global variables; a better solution to the problem would be to return a value. For example:

tokens = []
token_count = 0

def extend(tokens, line, token_count):
    """Extend tokens by the words in line; also return the new token count."""
    for word in line.split():
        tokens.append(word)
        token_count += 1
    return token_count

lines = ['this is one line', 'this is another line']
for line in lines:
    # Rebind the main program's token_count to the returned value.
    token_count = extend(tokens, line, token_count)
    print(tokens)

This function does not make any use of variables inherited from an outer scope; everything is local to the function or a function parameter.

In the case of token_count, it is a parameter, but you are reassigning it within the function. This is fine, but note that it will not change the value of token_count in the main program. That happens afterwards, when the main program assigns the returned value to it (see the token_count = extend(...) line).

In the case of tokens, you are mutating an existing mutable object (a list) by calling its append method, not reassigning it (you are not doing tokens = ...). The updated value is visible also in the main program.
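The difference between rebinding a parameter and mutating the object it refers to can be seen in a small sketch. The function names rebind and mutate, and the variable names total, words, returned, and after_rebind, are hypothetical and chosen only for this illustration.

```python
def rebind(count, words):
    # Both statements rebind local names only; the caller's variables
    # are left untouched.
    count += 1
    words = words + ["new"]  # builds a new list object
    return count


def mutate(words):
    # Changes the list object that the caller also holds.
    words.append("new")


total = 0
tokens = ["a"]

returned = rebind(total, tokens)
after_rebind = (total, list(tokens))  # snapshot: nothing changed in the caller

mutate(tokens)
print(returned, after_rebind, tokens)
```

After rebind, the caller still sees total as 0 and tokens as ['a']; only the call to mutate changes the list the caller holds.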

Note that here, the function both modifies a mutable argument (tokens) and also returns a value. This is a little bit unusual, so to avoid any confusion it is good to state explicitly that this is what it is doing - hence the docstring at the start of the function.

If you need to return more than one value in this way, then pack them into a tuple, which you can unpack in the main program.
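As a sketch of the tuple approach, here is a variant that returns both the token list and the count instead of mutating its argument. This is a deliberate change from the function above, and the name extend_and_count is made up for the example.

```python
def extend_and_count(tokens, line, token_count):
    """Return a new token list and the updated count as a tuple.

    Unlike the version that calls append, this does not modify the
    list that was passed in.
    """
    words = line.split()
    return tokens + words, token_count + len(words)


tokens = []
token_count = 0
lines = ['this is one line', 'this is another line']

for line in lines:
    # Unpack the returned tuple into the two names in the main program.
    tokens, token_count = extend_and_count(tokens, line, token_count)

print(tokens, token_count)
```

Whether to mutate the argument or return a fresh list is a design choice; returning everything keeps the function free of side effects, at the cost of building a new list on each call.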


Note: in this case, the value of token_count could instead simply be obtained using len(tokens) rather than keeping a separate variable for it at all, but for the sake of example, the code does not make use of that fact.
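For completeness, a minimal sketch of that simpler version, assuming the count is only ever needed as the length of the list:

```python
def extend(tokens, line):
    """Extend tokens in place with the words in line."""
    tokens.extend(line.split())


tokens = []
lines = ['this is one line', 'this is another line']

for line in lines:
    extend(tokens, line)

# No separate counter to keep in sync; derive it when needed.
token_count = len(tokens)
print(token_count)
```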

alani