pandas python - round() not behaving correctly

Question

I'm rounding values in a dataframe to 1 decimal place.

Here is the df

                                            Våren 2015  Hösten 2014  Våren 2014
Question                                                                      
1) Maten är vällagad och smakar bra          4.000000     3.469136    3.678571
Δ 2) Maten ser aptitlig ut                   3.883721     3.296296    3.592593
3) Det är en bra variation på grönsakerna    3.365854     2.901235    3.333333
Δ 4) Maten är bra varierad och passar mig    3.725000     3.365854    3.607143
5) Portionsstorleken är lagom                4.166667     3.875000    4.071429
Δ 6) Konsistensen på maten är bra            4.000000     3.468354    3.607143
7) Info om matens innehåll är tydlig         3.950000     3.454545    3.821429
8) Maten levereras i en bra förpackning      3.880952     3.987179    4.214286
9) Jag får den mat jag har beställt          4.166667     4.194805    4.481481

my code:

df.applymap(lambda x: round(x,1))

Output

                                            Våren 2015  Hösten 2014  Våren 2014
Question                                                                      
1) Maten är vällagad och smakar bra               4.0          3.5         3.7
Δ 2) Maten ser aptitlig ut                        3.9          3.3         3.6
3) Det är en bra variation på grönsakerna         3.4          2.9         3.3
Δ 4) Maten är bra varierad och passar mig         3.7          3.4         3.6
5) Portionsstorleken är lagom                     4.2          3.9         4.1
Δ 6) Konsistensen på maten är bra                 4.0          3.5         3.6
7) Info om matens innehåll är tydlig              3.9          3.5         3.8
8) Maten levereras i en bra förpackning           3.9          4.0         4.2
9) Jag får den mat jag har beställt               4.2          4.2         4.5

The code above rounds 3.95 in column 'Våren 2015' to 3.9 instead of 4.0.

Note: if I pass the number directly to the function, it returns the correct value:

round(3.95,1)

output

4.0

FYI: I'm using Python 2.7.9.

Can you show me the whole result? It worked fine in a loop for me. — Rahul K P, Jul 03 '15 at 12:24
Gathering from the other numbers, it may be that 3.95 is just a representation of a number like 3.94999999, rounded by the numpy/pandas internal string representation. When you enter 3.95 by hand, that may be a slightly different number. Turn up the printing precision for numpy/pandas and see if the 3.95 in `data` is still 3.95. — , Jul 03 '15 at 12:35
While I can't reproduce your problem, I managed to get the inverse problem: 3.94999999 gets rounded to 4.0: `data = np.array([1e8/25316455.75]); print(data); print(np.round(data)) \n [ 3.94999999] \n [ 4.]` — , Jul 03 '15 at 12:47
It's just floating point behaviour. And if that's what it rounds to, you should just go with it, or round it to more decimals if you feel your presentation is losing precision. — , Jul 03 '15 at 13:08
Did you get this dataframe by reading a CSV file, by any chance? The Pandas `read_csv` function unfortunately doesn't do correct rounding when converting the numeric strings in the CSV file to Python floats. See https://github.com/pydata/pandas/issues/8002 — Mark Dickinson, Jul 03 '15 at 16:08
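If the frame does come from a CSV file, one thing to try is the `float_precision` option of `read_csv`. This is a minimal sketch, assuming a recent pandas on Python 3 and a made-up CSV (the column names `question` and `score` are hypothetical). With `round_trip`, the parser should convert each numeric string the same way Python's own `float()` does:

```python
import io

import pandas as pd

# Hypothetical CSV content; the real file and its columns are not shown in the question.
csv_text = "question,score\n7,3.95\n"

# Ask the parser for round-trip float conversion instead of its default converter.
df = pd.read_csv(io.StringIO(csv_text), float_precision="round_trip")

value = df.loc[0, "score"]
print(repr(value))
print(round(value, 1))
```

Whether the default parser actually produces a different value depends on the pandas version and on the digits in the file, so treat this as something to test rather than a guaranteed fix.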

Ami Tavory · Accepted Answer · 2015-07-03T13:09:17.230

It's a bit hard to answer, as what you list is not a DataFrame, not a Python list of lists, etc.

However, you should note that there is probably no reason to do this in a loop, as it can be done vectorially (and correctly):

import numpy as np

data = [[ 4., 3.4691358, 3.67857143],
    [ 3.88372093, 3.2962963, 3.59259259],
    [ 3.36585366, 2.90123457, 3.33333333],
    [ 3.725, 3.36585366, 3.60714286],
    [ 4.16666667, 3.875, 4.07142857],
    [ 4., 3.46835443, 3.60714286],
    [ 3.95, 3.45454545, 3.82142857],
    [ 3.88095238, 3.98717949, 4.21428571],
    [ 4.16666667, 4.19480519, 4.48148148]]

>>> np.array(data).round(1)
array([[ 4. ,  3.5,  3.7],
   [ 3.9,  3.3,  3.6],
   [ 3.4,  2.9,  3.3],
   [ 3.7,  3.4,  3.6],
   [ 4.2,  3.9,  4.1],
   [ 4. ,  3.5,  3.6],
   [ 4. ,  3.5,  3.8],
   [ 3.9,  4. ,  4.2],
   [ 4.2,  4.2,  4.5]])

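Edit: Following the update to your question, I suspect something else. Many decimal values, 3.95 included, cannot be stored exactly as binary floating point numbers, so the value in your frame may be slightly below 3.95 even though it is displayed as 3.950000.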

Try running

df['Våren 2015'] < 3.95

or

df['Våren 2015'] - 3.95

I suspect the display is misleading you.
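To make that concrete, here is a small sketch. The stored value 3.9499999999 is an assumption picked for illustration, not the actual number in your frame. It shows how a value just below 3.95 is displayed as 3.950000 at six decimals and still rounds to 3.9:

```python
import pandas as pd

# Assumed value, slightly below 3.95 (for illustration only).
stored = 3.9499999999

df = pd.DataFrame({"Våren 2015": [stored]})

# At six decimals the value is indistinguishable from 3.95.
shown = "%.6f" % stored        # '3.950000'

# With more digits the difference becomes visible.
full = "%.12f" % stored

# The comparison suggested above exposes it as well.
below = bool((df["Våren 2015"] < 3.95).iloc[0])

# Rounding works on the stored value, not on the displayed one.
rounded = round(stored, 1)     # 3.9

print(shown, full, below, rounded)
```

If `below` comes out True for your real data, the rounding is correct for the stored value and only the display is hiding the difference.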

I thought it would be easier to replicate as a list of lists, but you are right. I've updated my comments. — Boosted_d16, Jul 03 '15 at 12:37

score 1 · Answer 2 · answered Jul 03 '15 at 12:28

You mentioned you're using a pandas dataframe. I'm not able to reproduce the behaviour you're seeing:

In [29]: data
Out[29]:
         c1        c2        c3
0  4.000000  3.469136  3.678571
1  3.883721  3.296296  3.592593
2  3.365854  2.901235  3.333333
3  3.725000  3.365854  3.607143
4  4.166667  3.875000  4.071429
5  4.000000  3.468354  3.607143
6  3.950000  3.454545  3.821429
7  3.880952  3.987179  4.214286
8  4.166667  4.194805  4.481481

In [30]: data.__class__
Out[30]: pandas.core.frame.DataFrame

In [31]: for index, row in data.iterrows():
             for cell in row:
                 print(str(cell) + ': ' + str(round(cell,1)))
   ....:
4.0: 4.0
3.4691358: 3.5
3.67857143: 3.7
3.88372093: 3.9
3.2962963: 3.3
3.59259259: 3.6
3.36585366: 3.4
2.90123457: 2.9
3.33333333: 3.3
3.725: 3.7
3.36585366: 3.4
3.60714286: 3.6
4.16666667: 4.2
3.875: 3.9
4.07142857: 4.1
4.0: 4.0
3.46835443: 3.5
3.60714286: 3.6
3.95: 4.0
3.45454545: 3.5
3.82142857: 3.8
3.88095238: 3.9
3.98717949: 4.0
4.21428571: 4.2
4.16666667: 4.2
4.19480519: 4.2
4.48148148: 4.5

As Ami correctly pointed out, there's no need to iterate over the matrix; the benefit of using numpy is that a single operation is applied to a whole series of items.
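For a DataFrame, the same idea can be sketched without `applymap` or a loop. This assumes a pandas version that provides `DataFrame.round` (older releases may not have it, in which case `np.round` on the values is an alternative). The columns `c1` and `c2` reuse the names from the session above, with only three of the rows:

```python
import numpy as np
import pandas as pd

# A few rows taken from the frame shown above.
df = pd.DataFrame({
    "c1": [4.0, 3.883721, 3.95],
    "c2": [3.469136, 3.296296, 3.454545],
})

# Round every cell to one decimal in a single vectorized call.
rounded = df.round(1)

# Alternative that works on the underlying numpy array.
rounded_np = np.round(df.values, 1)

print(rounded)
```

Both calls operate on the stored floats, so they will still give 3.9 for a cell that is really 3.9499999 and only displayed as 3.95.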
