Variables inside a function will be inherited from an outer scope if not assigned within the function. However, if you do assign within a function, then that variable is local (unless you use a nonlocal or global statement as appropriate), and you cannot use its value before you have assigned it. Where you are doing token_count += 1, you are trying to look up the existing value (in order to add 1 to it) before any value has been assigned within the function, and this will fail.
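As a minimal sketch of the rule described above (the function names read_only, broken_increment and make_counter are made up for illustration and are not from the question), the following shows a name that is only read, a name that is assigned and therefore local, and a nonlocal declaration that rebinds a variable of an enclosing function:

```python
token_count = 0  # defined in the outer scope

def read_only():
    # No assignment here, so the name is looked up in the outer scope.
    return token_count + 1

def broken_increment():
    # The assignment makes token_count local to this function, so the
    # read that "+=" performs finds no local value yet.
    token_count += 1

def make_counter():
    count = 0
    def increment():
        nonlocal count  # rebind the variable of the enclosing function
        count += 1
        return count
    return increment

print(read_only())  # 1

try:
    broken_increment()
except UnboundLocalError as exc:
    failure = type(exc).__name__
    print("broken_increment raised", failure)

counter = make_counter()
counter()
print(counter())  # 2
```

The error is raised when the function is called, not when it is defined, because the lookup of the unassigned local only happens at run time.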
It is not normally good practice to make use of global variables; a better solution to the problem would be to return a value. For example:
def extend(tokens, line, token_count):
    "extend tokens by the words in line; also return new token count"
    for word in line.split():
        tokens.append(word)
        token_count += 1
    return token_count

# initial values in the main program
tokens = []
token_count = 0
lines = ['this is one line', 'this is another line']
for line in lines:
    token_count = extend(tokens, line, token_count)
print(tokens)
This function does not make any use of variables inherited from an outer scope; everything is local to the function or a function parameter.
In the case of token_count, it is a parameter, but you are reassigning it within the function. This is fine, but note that it will not change the value of token_count in the main program. That happens later, when the main program assigns the returned value to it (see the token_count = extend(...) line).
In the case of tokens, you are mutating an existing mutable object (a list) by calling its append method, not reassigning it (you are not doing tokens = ...). The updated value is visible also in the main program.
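The difference between reassigning a parameter and mutating the object it refers to can be sketched as follows. The functions rebind and mutate are hypothetical helpers written only for this comparison:

```python
def rebind(count, items):
    # Both assignments only rebind local names; the caller sees nothing.
    count += 1
    items = items + ["new"]

def mutate(items):
    # append changes the list object itself, which the caller shares.
    items.append("new")

n = 0
words = ["a"]

rebind(n, words)
after_rebind = (n, list(words))  # snapshot taken before mutating
mutate(words)

print(after_rebind)  # (0, ['a'])
print(words)         # ['a', 'new']
```

Note that items = items + ["new"] builds a new list and binds the local name to it, which is why the caller's list is untouched by rebind.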
Note that here, the function both modifies a mutable argument (tokens) and also returns a value. This is a little bit unusual, so to avoid any confusion it is good to state explicitly that this is what it is doing - hence the docstring at the start of the function.
If you need to return more than one value in this way, then pack them into a tuple, which you can unpack in the main program.
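For instance, a variant that returns two values might look like the sketch below. The function extend_and_measure and the extra "longest word" result are invented for illustration; they are not part of the original problem:

```python
def extend_and_measure(tokens, line, token_count):
    "extend tokens by the words in line; also return (new token count, longest word)"
    longest = ""
    for word in line.split():
        tokens.append(word)
        token_count += 1
        if len(word) > len(longest):
            longest = word
    # The comma builds a tuple, which the caller can unpack.
    return token_count, longest

tokens = []
token_count, longest = extend_and_measure(tokens, "this is another line", 0)
print(token_count, longest)  # 4 another
```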
Note: in this case, the value of token_count could instead simply be obtained using len(tokens) rather than having a separate variable for it at all, but for the sake of example, the code does not make use of that fact.
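A sketch of that simpler version, assuming the count is only ever needed as the length of the list (it also swaps the explicit loop for list.extend, which is a change from the code above):

```python
def extend(tokens, line):
    "extend tokens by the words in line"
    tokens.extend(line.split())

tokens = []
for line in ['this is one line', 'this is another line']:
    extend(tokens, line)

print(len(tokens))  # 8
```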