I'm having a strange scope problem. The following code is a cell in a Jupyter Notebook. I have assigned p1_total_value and p2_total_value within correlation_check(). Then, later in correlation_check(), project_value(row) is defined and then finally applied to a DataFrame.

When I run this code, I get an UnboundLocalError saying that p1_total_value was referenced before assignment. If I add global p1_total_value to project_value(row), it tells me that p1_total_value is not defined. It seems like somehow the pandas.apply() is happening before the code that precedes it in correlation_check().

# Check for correlation between the values determined in this analysis and the final scores of
# the games from in_depth_games

def correlation_check(game):
    
    player_count = game.Player.max()
    game_len = (max([int(row) for row in game.Generation if row not in ['Last', 'Final']]))
    
    calculated_values = all_gens_full_db(player_count)
    
    if player_count == 2:
        compression = 13/game_len
    elif player_count == 3:
        compression = 11/game_len
    elif player_count in [4, 5]:
        compression = 10/game_len
        
    p1_total_value = 0
    p2_total_value = 0
    
    p1_final_score = game.iloc[-2].Action
    p2_final_score = game.iloc[-1].Action
        

    def project_value(row):
        
        if 'played' in row.Action:
            project_name = row.Action.split('played ')[1]
            gen = round(compression * int(row.Generation))
            project_row = calculated_values[calculated_values.Title == project_name]
            value = int(project_row[f'Value Gen{gen}'])
            
            if int(row.Player) == 1:
                p1_total_value += value
            else: p2_total_value += value
            

            
    game = game.apply(project_value, axis=1)
    #print(f'''
    #        Player 1 total project value: {p1_total_value}
    #        Player 2 total project value: {p2_total_value}
    #        \n
    #        Player 1 final score: {}''')
        
    
correlation_check(in_depth_game_1)
Jonathan M
  • I recommend you replace `p1_total_value` and `p2_total_value` with a list of two entries. That way, you can modify the list members in the function, which will work. That eliminates your namespace problems. – Tim Roberts May 02 '21 at 06:56
  • Use the `nonlocal` statement – juanpa.arrivillaga May 02 '21 at 07:05
  • @TimRoberts Making a list with two entries worked great. While I appreciate a solution, do you happen to have any idea why the inner function doesn't see the variable assigned in the outer function even if I use `global` in the inner function? I imagine it must be some idiosyncrasy with pandas.apply(). – Jonathan M May 02 '21 at 07:09
  • @juanpa.arrivillaga That raises the same error as using a `global` statement. It says that the variables in question are not defined. – Jonathan M May 02 '21 at 07:14
  • No, it doesn't. Where and what exactly are you doing? You need to do `nonlocal p1_total_value, p2_total_value` the first line of `project_value` – juanpa.arrivillaga May 02 '21 at 07:15
  • BTW, you *really* shouldn't be using apply for side-effects like this. – juanpa.arrivillaga May 02 '21 at 07:17
  • @juanpa.arrivillaga The code you described above is exactly what I added and it said that those variables were not defined. As for using apply() like this, I've seen that mentioned once before. Mind pointing me towards the better option? df.applymap()? – Jonathan M May 02 '21 at 07:20
  • You must be doing something wrong, this is the exact use-case for `nonlocal`. Can you provide a [mcve]? – juanpa.arrivillaga May 02 '21 at 07:21
  • My mistake, I must have had a typo when I first tried it. Tried to make an mwe, no error. Went back and tried it again in my code, no error. – Jonathan M May 02 '21 at 07:39
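
The behaviour discussed in these comments does not depend on pandas. Here is a minimal sketch in plain Python. The names `broken_total`, `fixed_total`, and `add` are hypothetical stand-ins for `correlation_check` and `project_value`, not code from the question. The first version reproduces the UnboundLocalError, and the second shows `nonlocal` as the first line of the inner function.

```python
def broken_total(values):
    total = 0

    def add(value):
        # `+=` is an assignment, so Python treats `total` as local to add().
        # Reading it before it has a local value raises UnboundLocalError.
        total += value

    for v in values:
        add(v)
    return total


def fixed_total(values):
    total = 0

    def add(value):
        # nonlocal makes the assignment rebind the enclosing function's name.
        nonlocal total
        total += value

    for v in values:
        add(v)
    return total


try:
    broken_total([1, 2, 3])
except UnboundLocalError as exc:
    print(f"broken_total failed: {exc}")

print(fixed_total([1, 2, 3]))  # 6
```

The error is raised on the first call to the inner function, which is why it can look as if `apply()` ran before the earlier assignments.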

2 Answers

0

Assign p1_total_value outside the function, at module level, and then declare global p1_total_value inside every function that assigns to it.

0

You should replace p1_total_value and p2_total_value with a list of two entries. That way, you can modify the list members in the function, which will work. That eliminates your namespace problems. The same technique can be applied to the final scores, too.

        
    total_value = [0,0]
    
    p1_final_score = game.iloc[-2].Action
    p2_final_score = game.iloc[-1].Action
        

    def project_value(row):
        
        if 'played' in row.Action:
            project_name = row.Action.split('played ')[1]
            gen = round(compression * int(row.Generation))
            project_row = calculated_values[calculated_values.Title == project_name]
            value = int(project_row[f'Value Gen{gen}'])
            
            if int(row.Player) == 1:
                total_value[0] += value
            else:
                total_value[1] += value
            
    game = game.apply(project_value, axis=1)
Tim Roberts
  • I don't have enough rep to upvote this, but this is the solution I'm using. It's unnecessary though for `p1_final_score` and `p2_final_score` as they are not accessed within the inner function. – Jonathan M May 02 '21 at 07:17
  • You really should be using `nonlocal`, this is a bit of a hack. – juanpa.arrivillaga May 02 '21 at 07:18
  • @juanpa.arrivillaga I disagree. Leaking values through `nonlocal` is a hack, mostly because Python's namespace rules are a bit opaque. – Tim Roberts May 02 '21 at 23:51
  • They aren't opaque at all. Assignment is always default local, unless you use a `global` or `nonlocal` directive – juanpa.arrivillaga May 03 '21 at 00:01
  • Your approach relies on implicit namespace resolution at runtime, using `nonlocal` will resolve this at compile time, and fail to even compile your code if there is no enclosing scope with those names – juanpa.arrivillaga May 03 '21 at 00:07
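
To illustrate the compile-time point in the last comment, here is a small sketch that uses `compile()` on two hypothetical snippets (the names `add`, `total`, and `totals` are made up for the example). It only covers the case where no enclosing binding exists at all. When the list is defined in an enclosing function, as in the answer above, the list approach works fine.

```python
# Case 1: nonlocal with no enclosing function scope that binds `total`.
nonlocal_source = '''
def add(value):
    nonlocal total
    total += value
'''

nonlocal_error = None
try:
    compile(nonlocal_source, "<example>", "exec")
except SyntaxError as exc:
    # Rejected before any code runs.
    nonlocal_error = exc.msg
print(f"compile-time error: {nonlocal_error}")

# Case 2: mutating a list that was never defined anywhere.
list_source = '''
def add(value):
    totals[0] += value
'''

code = compile(list_source, "<example>", "exec")  # compiles without complaint
namespace = {}
exec(code, namespace)  # defines add() inside `namespace`

runtime_error = None
try:
    namespace["add"](1)
except NameError as exc:
    # The missing name is only noticed when the function is called.
    runtime_error = str(exc)
print(f"runtime error: {runtime_error}")
```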