How to define a custom round function that works like pandas round in one line of code

Question

Goal

Round the DataFrame in a single line of code.
I took a round function from this post, but I want to call it like df.round(2): it should change only the affected columns and keep the order of the data, without my having to select the float or int columns first.
df.applymap(myfunction) raises TypeError: must be real number, not str, which means I have to select columns by type first.
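The function from the linked post is not shown here, so the following is only a sketch of how that error can arise, assuming the function is built on math.ceil (my guess, based on the wording of the error message). applymap calls the function once per cell, so a single string cell is enough to trigger it:

```python
import math

# Assumed shape of the custom rounding function (not confirmed by the question)
def float_round(num, places=2, direction=math.ceil):
    return direction(num * (10 ** places)) / float(10 ** places)

# A float cell rounds up to two decimal places
rounded = float_round(1.234)
print(rounded)

# A string cell: "abc" * 100 is still a string, and math.ceil rejects it
raised = False
try:
    float_round("abc")
except TypeError as exc:
    raised = True
    print(type(exc).__name__, exc)
```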

Try

I looked at the source code of round, but I could not understand how to adapt my function.

score 1 · Accepted Answer · answered Jun 20 '21 at 03:29

First, get the columns whose values are floats:

cols=df.select_dtypes('float').columns

Then round only those columns:

df[cols]=df[cols].agg(round,ndigits=2)
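The same column selection can also carry a custom rounding direction. This is a rough sketch on a made-up frame (the column names and values are purely illustrative) that rounds up with np.ceil instead of the built-in round:

```python
import numpy as np
import pandas as pd

# Hypothetical example data: one string, one float and one int column
df = pd.DataFrame({"name": ["a", "b"], "price": [1.234, 5.671], "qty": [3, 4]})

# Select the float columns, then round them up to two decimal places
cols = df.select_dtypes("float").columns
df[cols] = np.ceil(df[cols] * 10**2) / 10**2

print(df)
```

The string and int columns are left alone because they are never selected.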

If you would rather change the function itself, add an if/else condition so that non-float values are returned unchanged:

from numpy import ceil, floor


def float_round(num, places=2, direction=ceil):
    if isinstance(num,float):
        return direction(num * (10 ** places)) / float(10 ** places)
    else:
        return num

out=df.applymap(float_round)
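A quick way to try this on mixed data is sketched below. The frame is invented for illustration. Newer pandas versions (2.1 and later, to my knowledge) deprecate applymap in favour of DataFrame.map, so the sketch picks whichever is available:

```python
import numpy as np
import pandas as pd

def float_round(num, places=2, direction=np.ceil):
    # Only floats are rounded; every other value passes through untouched
    if isinstance(num, float):
        return direction(num * (10 ** places)) / float(10 ** places)
    return num

# Hypothetical example data with string, float and datetime columns
df = pd.DataFrame({
    "name": ["a", "b"],
    "price": [1.234, 5.671],
    "when": pd.to_datetime(["2021-06-20", "2021-06-21"]),
})

# Use DataFrame.map where it exists, otherwise fall back to applymap
elementwise = df.map if hasattr(df, "map") else df.applymap
out = elementwise(float_round)
print(out)
```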

score 0 · Answer 2 · answered Jun 20 '21 at 05:07

Given the error message you mention, the column most likely holds strings and needs to be converted to a numeric type.

Assuming the column is numeric, there are a few ways to implement a custom rounding function without reimplementing the .round() method of a DataFrame.

Given the requirements you laid out above, we want a way to round a DataFrame that:

fits on one line
doesn't require selecting numeric type

There are two ways we could do this that are functionally equivalent. One is to treat the dataframe as an argument to a function that is safe for numpy arrays.

Another is to use the apply method (explanation here), which applies a function to each row or each column.

import pandas as pd
import numpy as np

from numpy import ceil

# generate a 100x10 dataframe with a null value
data = np.random.random(1000) * 10
data = data.reshape(100,10)
data[0, 0] = np.nan
df = pd.DataFrame(data)

# changing data type of the second column
df[1] = df[1].astype(int)

# verify dtypes are different
print(df.dtypes)

# taken from another Stack Overflow post
def float_round(num, places=2, direction=ceil):
    return direction(num * (10 ** places)) / float(10 ** places)

# method 1 - use the dataframe as an argument
result1 = float_round(df)
print(result1.head())

# method 2 - apply 
result2 = df.apply(float_round)
print(result2)
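To check that the two methods agree, something like the following could be run. It is a small sketch on a seeded 5x3 frame rather than the 100x10 one above, so the numbers are reproducible:

```python
import numpy as np
import pandas as pd

def float_round(num, places=2, direction=np.ceil):
    return direction(num * (10 ** places)) / float(10 ** places)

# Small all-numeric frame with one int column, seeded for reproducibility
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((5, 3)) * 10)
df[1] = df[1].astype(int)

result1 = float_round(df)        # method 1: whole frame as the argument
result2 = df.apply(float_round)  # method 2: column by column

# Compare the two results value by value
print(np.allclose(result1.to_numpy(), result2.to_numpy()))
```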

Because apply is applied row or column-wise, you can specify logic in your round function to ignore non-numeric columns. For instance:

# taken from another Stack Overflow post
def float_round(num, places=2, direction=ceil):
    # check type of a specific column
    if num.dtype == 'O':
        return num
    return direction(num * (10 ** places)) / float(10 ** places)

# this works even when df has string (object) columns; method 1 would fail on them
result2 = df.apply(float_round)
print(result2)

This will fail if the columns include types other than strings, such as datetime. I think Anurag Dabas's answer avoids that problem. - Jack, Jun 20 '21 at 10:06
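One possible way around the datetime concern raised in the comment is to invert the check: instead of skipping object columns, round only the columns that are actually float. The sketch below uses pandas.api.types.is_float_dtype for that. The example frame is made up, and this is a variation on the answer's function rather than anything the answerer wrote:

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_float_dtype

def float_round(col, places=2, direction=np.ceil):
    # Leave every non-float column (strings, ints, datetimes) unchanged
    if not is_float_dtype(col):
        return col
    return direction(col * (10 ** places)) / float(10 ** places)

# Hypothetical example data mixing several dtypes
df = pd.DataFrame({
    "name": ["a", "b"],
    "price": [1.234, 5.671],
    "qty": [3, 4],
    "when": pd.to_datetime(["2021-06-20", "2021-06-21"]),
})

# Still a one-liner at the call site
result = df.apply(float_round)
print(result)
```

Note that int columns are skipped here as well, which differs slightly from the version above that rounds every non-object column.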
