
Can one use np.around to round to the nearest n-th decimal place?

import pandas as pd
import numpy as np

df=pd.DataFrame({'num':  [0.444, 0.445, 0.446, 0.4, 0.41, 0.49],
                 'near': [0.44, 0.45, 0.45, 0.4, 0.41, 0.49]})

df
Out[199]: 
     num  near
0  0.444  0.44
1  0.445  0.45
2  0.446  0.45
3  0.400  0.40
4  0.410  0.41
5  0.490  0.49

np.around - nok

np.around(df['num'], 2)    
Out[185]: 
0    0.44
1    0.44  # nok
2    0.45
3    0.40
4    0.41
5    0.49
Name: num, dtype: float64

built-in round on a single number - ok

round(0.445,2)

Out[216]: 0.45  # ok

built-in round on column - nok

round(df['num'],2)

Out[218]: 
0    0.44
1    0.44  # nok
2    0.45
3    0.40
4    0.41
5    0.49
Name: num, dtype: float64

built-in round via lambda on each column cell - ok

df['num'].apply(lambda x: round(x,2))

Out[219]: 
0    0.44
1    0.45  # ok
2    0.45
3    0.40
4    0.41
5    0.49
Name: num, dtype: float64
aeiou

1 Answer


Take a look at the Rounding article on Wikipedia to see the many different rounding rules. The two that are most commonly taught in schools are "round half-up" and "round half away from zero":

Half-up             : 1.5 -> 2   -1.5 -> -1
Half-away from zero : 1.5 -> 2   -1.5 -> -2
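
As a rough illustration of how these two rules differ, here is a minimal pure-Python sketch. The function names `round_half_up` and `round_half_away` are made up for this example, and the `x + 0.5` trick is only meant for simple values like the ones shown; it is not a robust general-purpose implementation.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer; ties go toward positive infinity."""
    return math.floor(x + 0.5)


def round_half_away(x: float) -> int:
    """Round to the nearest integer; ties go away from zero."""
    # Round the magnitude with ties going up, then restore the sign.
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


for value in (1.5, -1.5, 2.5, -2.5):
    print(value, round_half_up(value), round_half_away(value))
```

The two rules agree on positive ties and differ only on negative ones.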

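In Python 2, round used the "half-away from zero" rule. In Python 3 it rounds ties toward the even choice, but it works from the exact binary value of the float: 0.445 is stored as a value slightly above 0.445, so it is not a true tie and round(0.445, 2) returns 0.45.

np.around follows neither of the two rules above: it rounds ties to the nearest even value ("round half to even"). The notes in its documentation also say that it uses a fast but sometimes inexact algorithm, equivalent to np.rint(a * 10**decimals) / 10**decimals for positive decimals. Here 0.445 * 100 evaluates to exactly 44.5, which is a tie and rounds to the even integer 44, so rounding 0.445 to 0.44 is the expected behavior. Round half to even is also the default rounding mode in the IEEE 754 standard.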
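
To see where the 0.44 versus 0.45 discrepancy for 0.445 can come from, it may help to look at the value the float actually stores. The sketch below uses only the standard library; the last line is a pure-Python stand-in for the "scale, round ties to even, unscale" approach described in the np.around notes, not NumPy's actual code.

```python
from decimal import Decimal

x = 0.445

# Exact decimal expansion of the binary double nearest to 0.445.
exact = Decimal(x)
print(exact)

# The stored value sits slightly above 0.445, so it is not an exact tie.
print(exact > Decimal("0.445"))

# The built-in round works from that stored value.
print(round(x, 2))

# Scaling first loses the tiny excess: the product is exactly 44.5.
print(x * 100 == 44.5)

# Stand-in for scale -> round ties to even -> unscale.
print(round(x * 100) / 100)
```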


You can roll your own rounding function:

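def my_round(a: np.ndarray, decimals: int) -> np.ndarray:
    # Round half away from zero, vectorized over the whole array.
    factor = 10**decimals
    b = np.abs(a) * factor    # scale magnitudes so the target digit is the units digit
    frac = b - np.floor(b)    # fractional part left after scaling
    # Round the magnitude (ties go up), unscale, then restore the sign.
    return np.where(frac < 0.5, np.floor(b), np.ceil(b)) / factor * np.sign(a)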

my_round(df["num"], 2)
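
If the numbers should be treated as the decimal strings you typed rather than as binary floats, one alternative (a different technique from my_round, and not vectorized) is the standard library's decimal module. This is only a sketch: the helper name `round_half_up_decimal` is made up here, and it assumes that going through `str(value)` gives the decimal you intended.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up_decimal(value: float, decimals: int) -> float:
    """Round the decimal representation of a float, with ties going away from zero."""
    # str(value) gives the shortest repr, e.g. "0.445", which is then exact in Decimal.
    quantum = Decimal(1).scaleb(-decimals)  # e.g. Decimal("0.01") for decimals=2
    # decimal's ROUND_HALF_UP sends ties away from zero (so -0.445 -> -0.45).
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))


nums = [0.444, 0.445, 0.446, 0.4, 0.41, 0.49]
print([round_half_up_decimal(n, 2) for n in nums])
```

Because it loops in Python, this is likely to be much slower than the NumPy version on large columns; it trades speed for decimal-exact tie handling.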
Code Different
  • Is this function faster / more accurate than built-in round: `df['num'].apply(lambda x: round(x,2))`? – aeiou Oct 05 '22 at 08:14
  • It seems, when applied to `pd.DataFrame({'num': [random.uniform(10,100) for _ in range(100000000)]})`. – aeiou Oct 05 '22 at 08:19
  • You should avoid `apply` if you can. It uses a slow Python loop, whereas NumPy code is vectorized. – Code Different Oct 05 '22 at 11:58
  • "Python's round uses the "half-away from zero" rule." <- This isn't true, at least for Python 3. It uses round-ties-to-even. See docs here: https://docs.python.org/3/library/functions.html#round, particularly the "rounding is done toward the even choice " wording. – Mark Dickinson Oct 05 '22 at 15:52