I have the following DataFrame where the CMO _ ($ / MWH) is a finantial value with needs to be rounded using the old rounding method (e.g.: 754.275 must be rounded to 754.28):
INIT_DATE INIT_HOUR CMO_($/MWH)
0 2020-12-01 00:00:00 754.275
1 2020-12-01 01:00:00 728.130
2 2020-12-01 02:00:00 722.575
3 2020-12-01 03:00:00 722.045
4 2020-12-01 04:00:00 721.950
5 2020-12-01 05:00:00 721.035
6 2020-12-01 06:00:00 722.100
7 2020-12-01 07:00:00 739.925
8 2020-12-01 08:00:00 771.390
9 2020-12-01 09:00:00 797.415
10 2020-12-01 10:00:00 796.585
11 2020-12-01 11:00:00 791.875
12 2020-12-01 12:00:00 782.225
13 2020-12-01 13:00:00 783.540
14 2020-12-01 14:00:00 790.980
15 2020-12-01 15:00:00 815.555
16 2020-12-01 16:00:00 824.760
17 2020-12-01 17:00:00 782.265
18 2020-12-01 18:00:00 779.970
19 2020-12-01 19:00:00 784.640
20 2020-12-01 20:00:00 785.380
21 2020-12-01 21:00:00 785.840
22 2020-12-01 22:00:00 779.775
23 2020-12-01 23:00:00 763.775
Since Python 3+ uses the Bank's Rounding rule, I implemented a custom round function that makes the old rounding method:
def _old_round(n):
"""
Since Python 3+ the rounding is made through the Banker's Round Rule,
this function uses the old rounding method (round up if 0.5).
Args:
n (float): A number with 3 thousands values after decimal point
Returns:
(float) : The rounded number now with 2 decimal places, but using
the old round method
Example:
>>> _old_round(782.225)
>>> 782.23
Additional:
For more information about rounding at Python 3, go to:
https://stackoverflow.com/questions/10825926/python-3-x-rounding-behavior
"""
# * Multiply ou number by 100 since we want to round the 2 decimal place
new_n = n * 100
# * Round using the Decimal lib, but using the old method
n_dec = Decimal(new_n).quantize(Decimal('1'), rounding=ROUND_HALF_UP)
# * Roll back to original format
return float(n_dec / 100)
The function works normally when I use only numbers (see function example above), but when I use the Pandas apply or map to perform the calculation in the CMO _ ($ / MWH) column, some values are not rounded correctly (index 12), does anyone know why? Find the DF after calculation bellow:
df['ROUNDED_CMO'] = df['CMO_($/MWH)'].apply(_old_round)
df
Results:
INIT_DATE INIT_HOUR CMO_($/MWH) NEW_CMO
0 2020-12-01 00:00:00 754.275 754.28
1 2020-12-01 01:00:00 728.130 728.13
2 2020-12-01 02:00:00 722.575 722.58
3 2020-12-01 03:00:00 722.045 722.05
4 2020-12-01 04:00:00 721.950 721.95
5 2020-12-01 05:00:00 721.035 721.04
6 2020-12-01 06:00:00 722.100 722.10
7 2020-12-01 07:00:00 739.925 739.93
8 2020-12-01 08:00:00 771.390 771.39
9 2020-12-01 09:00:00 797.415 797.42
10 2020-12-01 10:00:00 796.585 796.59
11 2020-12-01 11:00:00 791.875 791.88
12 2020-12-01 12:00:00 782.225 782.22 <
13 2020-12-01 13:00:00 783.540 783.54
14 2020-12-01 14:00:00 790.980 790.98
15 2020-12-01 15:00:00 815.555 815.56
16 2020-12-01 16:00:00 824.760 824.76
17 2020-12-01 17:00:00 782.265 782.27
18 2020-12-01 18:00:00 779.970 779.97
19 2020-12-01 19:00:00 784.640 784.64
20 2020-12-01 20:00:00 785.380 785.38
21 2020-12-01 21:00:00 785.840 785.84
22 2020-12-01 22:00:00 779.775 779.78
23 2020-12-01 23:00:00 763.775 763.78
The value at index 12 should be: 782.23 and not 782.22, the function works for all the data but not for this specific index for some reason... I'm using a Miniconda Environment (Windows 10) with pandas 1.1.3, Python 3.6.12, jupyter-notebook 6.1.4 (I'm running the code here).
Obs: I already tried to use Math.Ceil, Numpy.Ceil, Decimal Package Round Options, and the solutions at: Question 1, Question 2, Question 3, none of them work properly. I also already tried (the index 12 still not rounding):
df['ROUNDED_CMO'] = df['CMO_($/MWH)'].map(lambda x : _old_round(x))
df['ROUNDED_CMO'] = df['CMO_($/MWH)'].apply(lambda x : _old_round(x))