13

I've searched the pandas documentation and cookbook recipes and it's clear you can round to the nearest decimal place easily using dataframe.columnName.round(decimalplace).

How do you do this with larger numbers?

Example, I have a column of housing prices and I want them rounded to the nearest 10000 or 1000 or whatever.

df.SalesPrice.WhatDoIDo(1000)? 
Alan Kavanagh
  • 9,425
  • 7
  • 41
  • 65
Angelo
  • 954
  • 2
  • 8
  • 15
  • 2
    Possible duplicate of [How do I round to the nearest ten?](https://stackoverflow.com/questions/39824914/how-do-i-round-to-the-nearest-ten) – Andy Ray Dec 23 '17 at 01:37
  • 1
    Divide by 1000, round, multiply by 1000 – Barmar Dec 23 '17 at 01:39
  • Lots of languages have lots of iterative ways to solve this. While there are no shortage of "how to round" questions on SO, I was looking for a specific pandas way to leverage the efficiency of this framework. – Angelo Dec 23 '17 at 13:47

5 Answers5

16

By using the notation df.ColumnName.round(), you are actually calling pandas.Series.round, the documentation of which specifies:

decimals : int

Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.

So you can do:

df = pd.DataFrame({'val':[1,11,130,670]})
df.val.round(decimals=-2)

This produces the output:

0      0
1      0
2    100
3    700
Name: val, dtype: int64

decimals=-3 rounds to the 1000s, and so on. Notably, it also works using pandas.DataFrame.round(), though the documentation doesn't tell you:

df = pd.DataFrame({'val':[1,11,130,670], 'x':[1,11,150,900]})
df.round({'val':-2})

This will round the column val to the nearest 100, but leave x alone.

Community
  • 1
  • 1
PaSTE
  • 4,050
  • 18
  • 26
3

Function round does accept negative values for cases in which you want to specify precision to the left of the decimal point:

dataframe.columnName.round(-3)

Example:

>>> pd.Series([1, 500, 500.1, 999, 1500, 1501, 946546]).round(-3)
0         0.0
1         0.0
2      1000.0
3      1000.0
4      2000.0
5      2000.0
6    947000.0
dtype: float64
grovina
  • 2,999
  • 19
  • 25
1

You can try this

df = pd.DataFrame({'val':[1,11,130,670]})
10**df.val.astype(str).str.len()
Out[27]: 
0      10
1     100
2    1000
3    1000
Name: val, dtype: int64
BENY
  • 317,841
  • 20
  • 164
  • 234
1

Another interesting "hack" is this: Let's say you want to round off to the nearest 100s. You can add 50, then divide by 100, convert to integer, multiply back by 100.

df = pd.DataFrame({'val':[1005,1299,1301,4109]})
df.val.round(-2) # Proper way
((df.val+50)/100).astype(int)*100 # Hack

Gives you this, as desired:

[1000, 1300, 1300, 4100]
FatihAkici
  • 4,679
  • 2
  • 31
  • 48
0

My favorite, dynamic way to do this:

ds: pd.Series to "round"
x: int/float of the power to round

# Define rounding lambda function:
my_rounder = lambda ds, x: ((ds + 0.5*10**x) // 10**x) * 10**x

# Apply lambda function to "prices" values:
housing_df["rounded_prices"] = my_rounder(housing_df["prices"], 3)

# If you need to force/ensure no decimal:
housing_df["rounded_prices"] = housing_df["rounded_prices"].apply(int)

Alternative floor rounder:

my_floor_rounder = lambda ds, x: (ds // 10**x) * 10**x

Breakdown:

print(housing_df["prices"].head())

year
2010    372560.0
2011    374507.0
2012    376454.0
2013    378401.0
2014    380348.0
Name: prices, dtype: float64
    
# This step can be omitted if you're finding the floor:
step_up = housing_df["prices"] + 0.5*10**3
print(step_up.head())

year
2010    373060.0
2011    375007.0
2012    376954.0
2013    378901.0
2014    380848.0
Name: prices, dtype: float64

thsnd = step_up // 10**3
print(thsnd.head())

year
2010    373.0
2011    375.0
2012    376.0
2013    378.0
2014    380.0
Name: prices, dtype: float64

rounded = thsnd * 10**3
print(rounded.head())

year
2010    373000.0
2011    375000.0
2012    376000.0
2013    378000.0
2014    380000.0
Name: prices, dtype: float64

int_rounded = rounded.apply(int)
print(int_rounded.head())

year
2010    373000
2011    375000
2012    376000
2013    378000
2014    380000
Name: prices, dtype: int64
JGarcia
  • 57
  • 5