I'm experimenting with Python/Pandas using a DataFrame having the following structure:
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({"item": ["A", "B", "C", "D", "E"],
                       "size_ratio": [0.3, 0.9, 1, 0.4, 0.7],
                       "weight_ratio": [0.5, 0.7, 1, 0.5, np.nan],
                       "power_ratio": [np.nan, 0.3, 0.5, 0.1, 1]})
    print(df)

      item  size_ratio  weight_ratio  power_ratio
    0    A         0.3           0.5          NaN
    1    B         0.9           0.7          0.3
    2    C         1.0           1.0          0.5
    3    D         0.4           0.5          0.1
    4    E         0.7           NaN          1.0
As you can see, each item is described by three normalized metrics: size_ratio, weight_ratio, and power_ratio. NaN values are possible for each metric.
My goal is to combine these metrics into a single global score (S) for each row. Specifically, the function I would like to implement is the following:
where
- s_i are the individual scores;
- w_i are user-defined weights associated with each metric;
- alpha is a user-defined parameter (positive integer).
I want to be able to quickly adjust the weights and the parameter alpha to test different combinations.
As an example, setting w_1 = 3, w_2 = 2, w_3 = 1 and alpha = 5, the output should be the following:
      item  size_ratio  weight_ratio  power_ratio  global_score
    0    A         0.3           0.5          NaN          0.36
    1    B         0.9           0.7          0.3          0.88
    2    C         1.0           1.0          0.5          0.99
    3    D         0.4           0.5          0.1          0.44
    4    E         0.7           NaN          1.0          0.70
Note that the denominator sums only the weights associated with the non-missing metrics; likewise, the numerator includes only the terms for the non-missing metrics.
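To make the missing-value rule concrete, here is a minimal sketch of how the denominator alone might be computed with the example weights (w_1 = 3, w_2 = 2, w_3 = 1). The names `weights`, `present`, and `weight_sum` are illustrative assumptions, and the numerator is deliberately left out.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "item": ["A", "B", "C", "D", "E"],
    "size_ratio": [0.3, 0.9, 1, 0.4, 0.7],
    "weight_ratio": [0.5, 0.7, 1, 0.5, np.nan],
    "power_ratio": [np.nan, 0.3, 0.5, 0.1, 1],
})

# Hypothetical name: maps each metric column to its user-defined weight.
weights = pd.Series({"size_ratio": 3, "weight_ratio": 2, "power_ratio": 1})

# 1 where a metric is present, 0 where it is NaN.
present = df[weights.index].notna().astype(int)

# Scale each column by its weight (aligned on column names), then sum
# across each row, so missing metrics contribute nothing to the total.
weight_sum = present.mul(weights, axis="columns").sum(axis=1)

print(weight_sum.tolist())  # [5, 6, 6, 6, 4]
```

Rows A and E each have one missing metric, so their totals drop from 6 to 5 and 4 respectively. Keeping the weights in a single Series (or dictionary) means they can be changed in one place when testing different combinations.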
Being relatively new to Python, I started by searching for answers here. In this post, I learned how to compute row-wise operations on a pandas DataFrame with missing values, and in this post, I saw an example that uses a dictionary to set the weights.
Unfortunately, I was not able to apply what I found to my specific problem. Right now, I'm using Excel to run different simulations, but I would very much like to experiment with this in Python. Any help would be greatly appreciated.