4

I pulled a single row from a DataFrame. It looks like this:

https://i.stack.imgur.com/Y9LUE.png

or

Clicks  Spend   clk_ar  CPC     AdRank  temp    tempRan
36.0    248.76  59.94   6.91    1.67    1.665   1.67

I need to round the values in column temp to 2 decimal places.

Option 1:

round(df.temp,2)

OUTPUT:
1676725    1.66
Name: temp, dtype: float64

Option 2:

df.temp.apply(lambda x:round(x,2))

OUTPUT:
1676725    1.67
Name: temp, dtype: float64

The two ways of rounding give different results. Option 1 seems to match Python 3's round-half-to-even behavior (see "Python 3.x rounding behavior").

I am just wondering why option 2 behaves like that. Thanks for your help!

Cheng
  • 3
    See also [here](https://stackoverflow.com/questions/42813777/rounding-in-numpy/42814054). – DSM Aug 09 '18 at 18:02
  • 1
    The bit that surprises me slightly is that pulling values out of a Pandas `Series` with dtype `np.float64` gives actual Python `float` objects rather than NumPy `float64` objects. (The two round differently on Python 3 even under Python's built-in `round` function.) – Mark Dickinson Aug 09 '18 at 18:05
  • 2
    Except that if you pull the values out _directly_ via indexing, you _do_ get `np.float64` instances instead of `float` instances. It's only under `apply` that you mysteriously get regular `float`s. Gah! – Mark Dickinson Aug 09 '18 at 18:06

1 Answer

5

I think the reason is in this note from the NumPy docs:

Notes

For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [1] and errors introduced when scaling by powers of ten.
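A minimal sketch of what the note describes. The first part uses exact .5 ties, which are representable in binary. The second part assumes NumPy rounds to a number of decimals by scaling, rounding to an integer, and scaling back, as the note's mention of "scaling by powers of ten" suggests; the variable names are only illustrative.

```python
import numpy as np

# Exact halfway cases go to the nearest even value,
# as the note says: 1.5 and 2.5 -> 2.0, -0.5 and 0.5 -> 0.0.
ties = np.array([-0.5, 0.5, 1.5, 2.5])
rounded = np.round(ties)
print(rounded)

# Scaling 1.665 by 100 lands exactly on 166.5 in double precision,
# so a tie appears that the stored value itself did not have.
scaled = np.float64(1.665) * 100
print(scaled, np.rint(scaled))  # rint then rounds that tie to the even integer
```

Under that assumption, the scaled value rounds down to 166, which matches the 1.66 seen in Option 1.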

In Option 1, you are rounding NumPy float64 values, which follow the rule above.

In Option 2, you are rounding Python float values, which go through Python's built-in round (see the Python docs).
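The difference can be reproduced on a one-element Series. This is a sketch under assumptions: the Series `s` stands in for the `temp` column, and the element types observed may depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's column (hypothetical one-row Series).
s = pd.Series([1.665], name="temp")

# Option 1: built-in round() on a Series delegates to Series.round,
# which uses NumPy's rounding on the float64 data.
opt1 = round(s, 2)

# Option 2: apply() hands each element to Python's built-in round().
opt2 = s.apply(lambda x: round(x, 2))

# Inspect which type each access path produces.
type_via_index = type(s.iloc[0])
types_via_apply = s.apply(lambda x: type(x).__name__)

print(opt1.iloc[0], opt2.iloc[0])
print(type_via_index, types_via_apply.iloc[0])
```

If `apply` reports plain `float` while indexing reports `numpy.float64`, that explains why the same call to `round` behaves differently in the two options.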

Fun with floating point arithmetic:

round(1.675, 2)  
1.68

round(2.675, 2) 
2.67
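These results look less arbitrary once the stored binary values are printed exactly. A small sketch using only the standard library; the helper `tie_side` is a hypothetical name introduced for illustration.

```python
from decimal import Decimal

def tie_side(literal: str) -> str:
    """Report whether the double nearest to `literal` sits above or below it."""
    stored = Decimal(float(literal))  # exact value of the binary double
    exact = Decimal(literal)          # the decimal value as written
    if stored > exact:
        return "above"
    if stored < exact:
        return "below"
    return "equal"

for literal in ("1.665", "1.675", "2.675"):
    value = float(literal)
    print(literal, Decimal(value), tie_side(literal), round(value, 2))
```

None of the three literals is stored as an exact halfway case, so Python's `round` simply goes to the nearer neighbor: up when the stored value is above the written one, down when it is below.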
Scott Boston
  • 1
    It's more subtle than this. In fact, Python's `round` follows the nearest-ties-to-even rule in all cases, while NumPy's does not. But because of the usual What You See Is Not What You Get nature of binary floating-point, the value actually being rounded isn't a halfway case at all. On a typical machine, the actual value being rounded will be `1.66500000000000003552713678800500929355621337890625`, which should round up. – Mark Dickinson Aug 09 '18 at 17:52
  • @MarkDickinson Thank you for this information and insight. – Scott Boston Aug 09 '18 at 17:55
  • Yes, sorry; I'm being unnecessarily nitpicky; the difference is, exactly as you state, that we're rounding NumPy `float64` instances versus regular Python `float`s, though it's a bit of a mystery why we're getting regular Python `float`s out of a `Series` with dtype `np.float64`. – Mark Dickinson Aug 09 '18 at 18:08
  • @MarkDickinson No need to apologize, I value such insights. Thanks again. – Scott Boston Aug 09 '18 at 18:09