I've a simple Pandas Dataframe with four columns:
NRAPPOR; DSCARAT; IQTACAP; IQTAINT
2; 2006-12-31; 0.00; 60.00
2; 2007-01-31; 270.75; 150.05
2; 2007-02-28; 272.78; 148.02
2; 2007-03-31; 274.82; 145.98
2; 2007-04-30; 276.88; 143.92
... ... ... ...
5731; 2016-11-17; 1760.00; 240.00
5731; 2018-11-17; 1800.00; 200.00
5731; 2019-11-17; 1850.00; 150.00
5731; 2020-11-17; 1900.00; 100.00
5731; 2021-11-17; 1950.00; 50.00
where:
- NRAPPOR = loan ID
- NSCARAT = installment expiry date
- IQTACAP = Principal portion of installment
- IQTAINT = Interset portion of installment
For each NRAPP, I'd like to sum of IQTACAP and IQTAINT values in four distinct totalizers according to whether or not DSCARAT is less than the threshold date ('2020-03-17'
I want to sum IQTACAP in totCapOverdue if DSCADRAT is <= than the threshold date ('2020-03-17'
I want to sum IQTACAP in totCapToExpire if DSCADRAT is > than the threshold date ('2020-03-17'
I want to sum IQTAINT in totIntOverdue if DSCADRAT is <= than the threshold date ('2020-03-17'
I want to sum IQTAINT in totIntToExpire if DSCADRAT is > than the threshold date ('2020-03-17'
I'd like to obtain a New DF with 5 columns; NRAPPOR and the four totalizers
'This is my barbaric code:
'set threshold date
dataSoglia = '2020-03-17'
totCapOverdue = 0
totIntOverdue = 0
totCapToExpire = 0
totIntToExpire = 0
rapportoPrev = 0
for index, row in df1.iterrows():
'if NRAPPORT changes, I print the totalizer
'I would prefer to obtain a new Dataframe with NRAPPOR and the four totalizer as new columns
if((index[0]!=rapportoPrev) & (rapportoPrev!=0)):
print(rapportoPrev,'\t', 'capOverdue: ', totCapOverdue, '\t', 'intOverdue: ', totIntOverdue, '\t','capToExpire: ', totCapToExpire, '\t', 'intpToExpire: ', totIntToExpire)
'set totalizer to zero
totCapOverdue = 0
totIntOverdue = 0
totCapToExpire = 0
totIntToExpire = 0
if (index[1].strftime("%Y-%m-%d") <= dataSoglia):
totCapOverdue += row['IQTACAP']
totIntOverdue += row['IQTAINT']
else:
totCapToExpire += row['IQTACAP']
totIntToExpire += row['IQTAINT']
rapportoPrev = index[0]
dataPrev=index[1]
This is my output:
2 capOverdue: 19999.999999999993 intOverdue: 4887.200000000001 capToExpire: 0 intpToExpire: 0
3 capOverdue: 123156.18000000002 intOverdue: 70519.02 capToExpire: 26843.820000000003 intpToExpire: 1528.9799999999996
4 capOverdue: 30000.0 intOverdue: 4965.180000000001 capToExpire: 0 intpToExpire: 0
5 capOverdue: 6000.000000000002 intOverdue: 167.1 capToExpire: 0 intpToExpire: 0
6 capOverdue: 18000.0 intOverdue: 2111.89 capToExpire: 0 intpToExpire: 0
7 capOverdue: 50000.00000000003 intOverdue: 8104.3 capToExpire: 0 intpToExpire: 0
8 capOverdue: 50000.00000000003 intOverdue: 15711.999999999996 capToExpire: 0 intpToExpire: 0
9 capOverdue: 70000.0 intOverdue: 18213.110000000004 capToExpire: 0 intpToExpire: 0
...
'Is there a better way to di that?
Thanks