0

I have been using pandas library for data manipulation. And I am stuck somewhere while doing the below calculation.

I have below table in my excel file which contains two columns and I need to create third column (cap). I need to do this excel calculation of third column in my program using python pandas.

First I will create the below dataframe DF which contains two columns and I need to create third column in df using Excel formulas in python.

Data frame (DF):-

   Period       rates     
  01-01-2021   0.0028    
  01-02-2021   0.0082   
  01-03-2021   0.0020   
  01-04-2021   0.0043    
  01-05-2021   0.0066   

Excel Table:-

     A             B         C
1   Period       rates     cap 
2   01-01-2021   0.0028    =if(month(A2)=04,1,(1+$B3)*C3)
3   01-02-2021   0.0082    =if(month(A3)=04,1,(1+$B4)*C4)
4   01-03-2021   0.0020    =if(month(A4)=04,1,(1+$B5)*C5)
5   01-04-2021   0.0043    =if(month(A5)=04,1,(1+$B6)*C6)
6   01-05-2021   0.0066    =if(month(A6)=04,1,(1+$B7)*C7)


I have just created the third column (cap) to understand the formula. And I need to do this in my python program.

Yaser
  • 61
  • 6
  • 1
    This post should answer your question https://stackoverflow.com/questions/40353519/how-to-apply-custom-function-to-pandas-data-frame-for-each-row – Ulto 4 Aug 02 '21 at 17:06
  • I would recommend using the 'apply' function from pandas. [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html](python doc link) – tcglezen Aug 02 '21 at 17:08
  • I think we will have to use something like cumulative function to do this because it uses the below cell value to get current cell value. – Yaser Aug 02 '21 at 17:11
  • And we will have to reverse this column because it uses the down cell value. If it had normal formula then I would have definitely done this using python pandas. But it's not that easy I think. – Yaser Aug 02 '21 at 17:14
  • In cell C2 it uses the C3 value. Please see the excel formula. – Yaser Aug 02 '21 at 17:16
  • I think I have explained everything. If you have any question or need more explanation then please ask me. Thanks – Yaser Aug 02 '21 at 17:21

1 Answers1

2

Assuming the "period" column in dataframe has already been converted to the datetime object, then by simply defining custom function and using df.apply() would most likely do your job.

Example: (please also change the custom function correctly since I did not include multiply by the cap value below term)

import pandas as pd

df = pd.DataFrame({
        'period': ['01-01-2021', '01-02-2021', '01-03-2021', '01-04-2021', '01-05-2021'],
        'rates': [0.0028, 0.0082, 0.0020, 0.0043, 0.0066]
    })

def cap_criteria(row):
    if row['period'].month == 4:
        return 1
    else:
        return (1 + row['rates'])

df['cap'] = df.apply(cap_criteria, axis=1)

print(df)

Output:

      period   rates     cap
0 2021-01-01  0.0028  1.0028
1 2021-02-01  0.0082  1.0082
2 2021-03-01  0.0020  1.0020
3 2021-04-01  0.0043  1.0000
4 2021-05-01  0.0066  1.0066

If the "period" column is in string format, you can converted it by:

df['period'] = pd.to_datetime(df['period'], format='%d-%m-%Y')

or

df['period'] = pd.to_datetime(df['period'], format='%m-%d-%Y')

depending on which date convention you are using.

SayYoungMan
  • 33
  • 1
  • 6
  • Thanks. But could you please explain the else part in your program. Actually in my given excel formula I am using (1+$B3)*C3) and I think it should all be in else part of your program but there is only 1 + row['rates']. – Yaser Aug 02 '21 at 17:36
  • And second thing in my cell C2 I am using down cell value which is C3. – Yaser Aug 02 '21 at 17:39
  • Good one SayYoungMan +1. – Karn Kumar Aug 02 '21 at 17:43