I am new to pandas. I have written a function which I want to apply to all entries in a pandas DataFrame "Monthly_mean_consump" for which a sample of data is provided below.
Time 2010-08-31 2010-09-30 2010-10-31 2010-11-30 2010-12-31 2011-01-31 2011-02-28 2011-03-31 2011-04-30
00:00 0.429188 0.302406 0.587415 0.227820 0.115938 0.170616 0.056256 0.078167 0.476515
00:30 0.340452 0.385037 0.218495 0.238118 0.134938 0.123279 0.054984 0.108111 0.173700
01:00 0.455451 0.433700 0.229352 0.253046 0.391425 0.313715 0.401116 0.123304 0.453640
01:30 0.601494 0.576142 0.425599 0.590769 0.486930 0.419002 0.560737 0.554705 0.544313
02:00 0.504342 0.584523 0.614539 0.375294 0.377951 0.342675 0.357372 0.367332 0.391336
02:30 0.527724 0.443303 0.457283 0.369515 0.392317 0.379410 0.391916 0.444807 0.491411
03:00 0.429236 0.531203 0.464098 0.370421 0.426875 0.360209 0.276282 0.179577 0.304990
03:30 0.442019 0.510604 0.314080 0.372268 0.443884 0.461649 0.390262 0.284042 0.417354
04:00 0.359956 0.411183 0.390059 0.398598 0.397369 0.351005 0.534459 0.317548 0.309686
04:30 0.352609 0.247238 0.421061 0.385950 0.300075 0.200626 0.402874 0.126153 0.261859
I have written a function whose input is the individual element in dataframe "Monthly_mean_consump" and wherever this value is larger than 0.45, the function finds corresponding cost from another DataFrame "df_loss_peak" (a sample is provided below) and returns corresponding cost.
Peak_Consumption Cost
0 0.450000 1.413333
1 0.557895 4.651579
2 0.665789 7.133158
3 0.773684 9.614737
4 0.881579 12.096316
5 0.989474 14.577895
6 1.097368 17.059474
7 1.205263 19.541053
8 1.313158 22.022632
9 1.421053 24.504211
10 1.528947 26.985789
11 1.636842 29.467368
12 1.744737 31.948947
13 1.852632 34.430526
14 1.960526 36.912105
15 2.068421 39.393684
16 2.176316 41.875263
17 2.284211 44.356842
18 2.392105 46.838421
19 2.500000 49.320000
The function which I have written is following:
def cost_consump(mean_consump):
if(mean_consump >= 0.45):
for i in range(0, len(df_loss_peak)):
if(df_loss_peak["Peak_Consumption"][i] > mean_consump):
cost = df_loss_peak["Cost"][i]
return cost
break;
return 0
I have used two nested for loops (code is below) to apply this function to each elements of the dataframe "Monthly_mean_consump" and I am getting the desired output.
Monthly_mean_cost = pd.DataFrame.copy(Monthly_mean_consump,deep=True)
for i in range(0,48):
for j in range(0,12):
mean_consump = Monthly_mean_consump.transpose().iloc[i,j]
Monthly_mean_cost.transpose().iloc[i,j] = cost_consump(mean_consump)
However, I am aware that this is quite time consuming and pandas provides an efficient fuction "applymap()" which is much suited for this job. I have used the applymap() function in various other taks. But when I write the following code in order to use applymap() here, I get keyindex error -1:
Monthly_mean_consump.transpose().applymap(cost_consump)
Any help would be appreciated? Also am I right in thinking that applymap() function would be much efficient than the loop which I have implemented?
Thanks