First of all, thank you for all the questions and answers. So far I have always found a solution to my problems here, but I'm stuck on the following one.
I have a dataframe like this:
Jan_x Feb_x Mar_x Apr_x ... driest driest_rr DMAI Station_id
0 -433 -398 -18 508 ... Mar_x 2684 37.189000 2
1 -95 -102 164 631 ... Mar_x 2732 30.568445 10
2 59 272 691 1165 ... Jan_x 1970 40.237462 12
3 30 239 696 1108 ... Feb_x 3548 43.941148 13
4 -1128 -1193 -985 -667 ... Feb_x 12715 334.828246 15
(995 rows in total)
The first 12 columns are monthly mean temperatures (in 0.01 °C), and the last column ('Station_id') is an identifier for climate stations. From another dataframe containing precipitation data I got the driest month ('driest') and its precipitation amount ('driest_rr', in 0.01 mm). Finally, 'DMAI' is an annual aridity index calculated in the previous step. Now I want to compute another aridity index (for meteorologists/climate scientists: the Pinna Combinative Index) that combines the annual mean temperature and precipitation (already contained in 'DMAI') with the mean temperature and precipitation of the driest month. The equations are:
DMAI = P/(T+10)
PCI = 0.5 * (DMAI + 12*Pd/(Td+10))
with P, T the annual precipitation and mean temperature, and Pd, Td the precipitation and mean temperature of the driest month (in mm and °C respectively).
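To make the formula concrete, here is a minimal scalar sketch. The function name `pinna_index` and the input values are made up for illustration and are not taken from my data; inputs are assumed to be already in mm and °C.

```python
def pinna_index(p_annual_mm, t_annual_c, p_driest_mm, t_driest_c):
    """Pinna Combinative Index from precipitation (mm) and temperature (deg C)."""
    # Annual term, i.e. the DMAI defined above
    dmai = p_annual_mm / (t_annual_c + 10)
    # Driest-month term; the factor 12 scales the monthly precipitation to a year
    driest_term = 12 * p_driest_mm / (t_driest_c + 10)
    return 0.5 * (dmai + driest_term)


# Made-up values: 600 mm and 10 deg C annually, 20 mm and 2 deg C in the driest month
pci_example = pinna_index(600.0, 10.0, 20.0, 2.0)
print(pci_example)  # 25.0, since 0.5 * (600/20 + 240/12) = 0.5 * (30 + 20)
```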
I already have:
df['PCI'] = 0.5 * (df.loc[:,'DMAI'] + 12*(df.loc[:,'driest_rr']/100) / (df.loc[:,'Mar_x']/100 + 10))
which works. However, the driest month is not always March; I need the column named in 'driest' for each row.
df['PCI'] = 0.5 * (df.loc[:,'DMAI'] + 12*(df.loc[:,'driest_rr']/100) / (df.loc[:,df.loc[:,'driest']]/100 + 10))
does not work, however.
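My best guess at what goes wrong, shown on a made-up three-row frame (`demo` and its values are assumptions for illustration, not my real data): passing the 'driest' column as the column indexer seems to select whole columns, one per label, instead of one value per row.

```python
import pandas as pd

# Three rows of made-up data in the same layout as the dataframe above
demo = pd.DataFrame({
    'Jan_x': [59, 30, -433],
    'Feb_x': [272, 239, -398],
    'Mar_x': [691, 696, -18],
    'driest': ['Jan_x', 'Feb_x', 'Mar_x'],
})

# A Series of labels used as the column indexer picks entire columns
selected = demo.loc[:, demo['driest']]
print(type(selected).__name__)  # DataFrame, not a Series with one value per row
print(selected.shape)           # one column per entry in 'driest'
```

With 995 rows this would give a 995 x 995 frame, which cannot be combined with the other columns the way the formula needs.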
Is there a way to solve this?
I found a few similar questions, such as this one:
How can I select a specific column from each row in a Pandas DataFrame?
However, the answers I found use either the deprecated df.lookup()
or a NumPy workaround, so they don't help me in this case.
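For reference, the kind of pandas-only result I am after could perhaps be sketched with a row-wise `DataFrame.apply`, as below. The `demo` frame, its values and the helper name `driest_temp` are made up for illustration; I have not checked whether this is the recommended replacement for df.lookup().

```python
import pandas as pd

# Made-up data in the same layout and units as the dataframe above
demo = pd.DataFrame({
    'Jan_x': [59, 30, -433],
    'Feb_x': [272, 239, -398],
    'Mar_x': [691, 696, -18],
    'driest': ['Jan_x', 'Feb_x', 'Mar_x'],
    'driest_rr': [1970, 3548, 2684],
    'DMAI': [40.0, 44.0, 37.0],
})

# For each row, read the value from the column whose name is stored in 'driest'
driest_temp = demo.apply(lambda row: row[row['driest']], axis=1)

# Convert 0.01 mm and 0.01 deg C to mm and deg C before applying the formula
demo['PCI'] = 0.5 * (
    demo['DMAI'] + 12 * (demo['driest_rr'] / 100) / (driest_temp / 100 + 10)
)
print(demo[['driest', 'PCI']])
```

A row-wise apply is slow on large frames, but for roughly 1000 rows that should not matter.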