
Nested np.where calls can replace values based on specified conditions. However, as I understand it, df.where (pandas DataFrame.where) is the pandas-native equivalent.
Unlike np.where, df.where keeps the original value where the condition is True and substitutes the specified alternative where it is False.
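
To make the difference in semantics concrete, here is a minimal sketch on a small made-up Series (the values and the fill value 24 are illustrative assumptions, not the full df_age data). Both calls fill the missing entries, but the condition is inverted between them.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two known ages, two missing ones
s = pd.Series([20.0, np.nan, 26.0, np.nan])

# np.where(cond, x, y): takes x where cond is True, y where it is False
via_numpy = np.where(s.isnull(), 24, s)

# Series.where(cond, other): keeps the original value where cond is True
# and substitutes `other` where cond is False
via_pandas = s.where(s.notnull(), 24)

print(via_numpy.tolist())   # [20.0, 24.0, 26.0, 24.0]
print(via_pandas.tolist())  # [20.0, 24.0, 26.0, 24.0]
```

Note that np.where returns a NumPy array, while Series.where returns a Series that keeps the original index.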

''' #snippet of data: df_age
     0  1
0   20  3
1   23  4
2   26  5
3   29  2
4   NaN 1
5   NaN 2
6   NaN 3
7   NaN 0
'''
## define function to check NaN/null age and assign age
def impute_age(cols):
    age = cols[0] # assign first column to age Series
    #print(f'age: {age}') #debug
    pclass = cols.loc[:,1] # assign 2nd to pclass
    
    '''
    ## Nested np.where that works as expected 
    age_null = np.where((cols[0].isnull()) & (cols[1]==1), 37,
                        np.where((cols[0].isnull()) & (cols[1]==2), 39,
                        np.where(cols[0].isnull(), 24,  # any other pclass; the original (!= 1) or (!= 2) test was always True
                                      cols[0]))).astype(int)
    '''
    
    ## nested pd.where
    age_null = age.where((cols[0].notnull()) & ... ... ... )
    
    print(f'age col[0]: \n{age_null}')

impute_age(df_age)

Q1: What is a more Pythonic way to write the working nested np.where?
Q2: [Main question] How do I write df.where (age.where) to achieve the same result as the nested np.where?

NB: I'll be testing out using mask at a later stage.
NB: I used nested np.where to replace nested if ...
NB: I'm trying out pd.where, and later mask, map, and substitute, for demonstration purposes and, at the same time, to learn how best to write them (pythonically).


NB: I took note of the following:
SO: df.where
SO: truth value
np.where
pd.where
SO: pd.where col change
SO: pd.where nested loop
SO: pd.where convert values
SO: pd.where elif logic
SO: pd.where nested?

semmyk-research

1 Answer


Don't use np.where/df.where here; you can simplify your code with map and fillna.

Here is an example:

s = cols[1].map({1: 37, 2: 39}).fillna(24)
age_null = cols[0].fillna(s)
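
For a version that runs on its own, the sketch below rebuilds the df_age snippet from the question as a DataFrame (the construction is an assumption based on the printed data) and applies the same map and fillna steps. The names fill_values and age_filled are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the df_age snippet in the question
df_age = pd.DataFrame({
    0: [20, 23, 26, 29, np.nan, np.nan, np.nan, np.nan],
    1: [3, 4, 5, 2, 1, 2, 3, 0],
})

# Look up a fill value per pclass: 1 -> 37, 2 -> 39, anything else -> 24
fill_values = df_age[1].map({1: 37, 2: 39}).fillna(24)

# Only missing ages take the looked-up value; existing ages are untouched
age_filled = df_age[0].fillna(fill_values).astype(int)

print(age_filled.tolist())  # [20, 23, 26, 29, 37, 39, 24, 24]
```

Because fillna aligns on the index, each missing age picks up the fill value from its own row.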
Shubham Sharma
  • Thanks @Shubham, the .map approach works like a breeze. Impressive. Two things: 1. Would you be kind enough to expand on "don't use df.where"? (I can see the challenges with df.where the more I try to get it working for the example I'm working on.) 2. Would you mind showing how df.where can be used for this example nonetheless? Many thanks – semmyk-research Dec 02 '22 at 06:54