0

Pandas newbie trying to replicate sql to python. Referencing the below post, I could use a simple function to calculate the weighted average of a column in a pandas dataframe.

Calculate weighted average using a pandas/dataframe

  Date        ID      wt      value   
  01/01/2012  100     0.50    60      
  01/01/2012  101     0.75    
  01/01/2012  102     1.00    100
  01/02/2012  201     0.50    
  01/02/2012  202     1.00    80

However, if I had conditions in both numerator and denominator and to get an aggregate of the weighted average, I would do the below in sql:

 SELECT
    date
    , id
    , SUM(CASE WHEN value IS NOT NULL THEN value * wt ELSE 0 END) /
        NULLIF(SUM(CASE WHEN value > 0 THEN wt ELSE 0 END), 0)
    AS wt_avg

 FROM table
 GROUP BY date, id

How would we replicate this in Pandas?

Thanks in advance.

Lucky
  • 333
  • 1
  • 3
  • 14

1 Answers1

1

Consider using calculated, helper columns according to specified logic with np.where() replacing the CASE statements and Series.fillna() as counterpart to NULLIF.

df['numer'] = np.where(pd.notnull(df['value']), df['value'] * df['wt'], 0)
df['denom'] = pd.Series(np.where(df['value'] > 0, df['wt'], 0)).fillna(0)

df['wt_avg'] = (df.groupby(['Date', 'ID'])['numer'].transform(sum) /  
                    df.groupby(['Date', 'ID'])['denom'].transform(sum))

print(df)
# print(df.drop(columns=['numer', 'denom']))      # DROP HELPER COLUMNS

#          Date   ID    wt  value  numer  denom  wt_avg
# 0  01/01/2012  100  0.50   60.0   30.0    0.5    60.0
# 1  01/01/2012  101  0.75    NaN    0.0    0.0     NaN
# 2  01/01/2012  102  1.00  100.0  100.0    1.0   100.0
# 3  01/02/2012  201  0.50    NaN    0.0    0.0     NaN
# 4  01/02/2012  202  1.00   80.0   80.0    1.0    80.0
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • Thank you @Parfait - I will try the above. So, if we have multiple columns, we would just extend the same right? So, it would be wt_avg1, wt_avg2 and so on.. – Lucky Jun 21 '18 at 21:44
  • Great! Glad to help. Happy coding! – Parfait Jun 22 '18 at 18:55