General problem:
I have two similar data frames (same shape, same variables, but different values).
How can I apply a function (for example with applymap) to each cell of df1, according to the value of the same cell in df2?
My specific problem:
How can I apply round() to each cell of df1, using the number of decimals of the same cell in df2?
I did this with a for loop over the columns of my dataframe.
I now want to optimize the code with df.applymap() or df.apply(np.vectorize()) to avoid the loop.
Optional: I also want to shuffle these decimal counts within each variable.
The code below works properly but needs to be optimized.
import numpy as np
import pandas as pd
# Count decimal number
def decimal_count(number):
    f = str(number)
    if '.' in f:
        digits = f[::-1].find('.')
    else:
        digits = 0
    return digits
# dataframe I want to round
df_to_round = pd.DataFrame({'Integers': [1, 2.3, 4.1, 4, 5],
                            'Float': [1.1, 2.2, 3.5444, 4.433, 5.5555]})
# dataframe with correct decimal number
df_rounded = pd.DataFrame({'Integers': [1, 2, 3, 4, 5],
                           'Float': [1.1, 6.233, 3.34, 4.46, 5.777]})
# round to the right decimal
for column in df_rounded.columns:
    # get decimal
    df_to_round['decimals'] = df_rounded[column].apply(decimal_count)
    # shuffle decimal level
    # only if needed
    # df_to_round['decimals'] = np.random.permutation(df_to_round['decimals'].values)
    # Apply round function to df_to_round
    df_to_round[column] = df_to_round[[column, 'decimals']].apply(lambda x: round(x[column], int(x['decimals'])), axis=1)
# remove the helper column once every column has been rounded
df_to_round.drop(['decimals'], axis=1, inplace=True)
My main obstacle is how to adapt the "# Apply round function to df_to_round" step to a vectorized method.
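For illustration, here is a rough sketch of two directions this could take, not a definitive solution. It assumes both frames have the same columns in the same order and the same number of rows, so cells can be matched by position. The helper names `decimals_of`, `round_like` and `round_by_scaling` are made up for this example. The first variant pairs cells with `np.vectorize` and keeps Python's `round()`; the second replaces `round()` with array arithmetic (scale, `np.round`, unscale), which may differ from `round()` on values that sit exactly on a rounding boundary.

```python
import numpy as np
import pandas as pd


def decimal_count(number):
    """Count the digits after the decimal point in str(number)."""
    text = str(number)
    return len(text) - text.index('.') - 1 if '.' in text else 0


def decimals_of(df_reference):
    """Hypothetical helper: per-cell decimal counts, column by column."""
    return df_reference.apply(lambda col: col.map(decimal_count))


def round_like(df_values, df_reference):
    """Round each cell of df_values to the decimals of the matching cell in df_reference.

    Cells are paired by position, so both frames must share shape and column order.
    """
    decimals = decimals_of(df_reference).to_numpy(dtype=int)
    # np.vectorize pairs the two arrays element by element
    cell_round = np.vectorize(lambda v, n: round(float(v), int(n)), otypes=[float])
    rounded = cell_round(df_values.to_numpy(dtype=float), decimals)
    return pd.DataFrame(rounded, index=df_values.index, columns=df_values.columns)


def round_by_scaling(df_values, df_reference):
    """Array-arithmetic variant: scale, round to the nearest integer, unscale.

    Not identical to round() in every floating-point edge case.
    """
    decimals = decimals_of(df_reference).to_numpy(dtype=int)
    scale = 10.0 ** decimals
    rounded = np.round(df_values.to_numpy(dtype=float) * scale) / scale
    return pd.DataFrame(rounded, index=df_values.index, columns=df_values.columns)


df_to_round = pd.DataFrame({'Integers': [1, 2.3, 4.1, 4, 5],
                            'Float': [1.1, 2.2, 3.5444, 4.433, 5.5555]})
df_rounded = pd.DataFrame({'Integers': [1, 2, 3, 4, 5],
                           'Float': [1.1, 6.233, 3.34, 4.46, 5.777]})

result = round_like(df_to_round, df_rounded)
scaled = round_by_scaling(df_to_round, df_rounded)
print(result)
print(scaled)
```

Note that `np.vectorize` is mainly a convenience wrapper and still calls the Python function once per cell, so only the scaling variant avoids per-cell Python calls during the rounding step. If the decimal counts should be shuffled within each variable, the frame returned by `decimals_of` could presumably be permuted column by column with `np.random.permutation` before rounding.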