I'd like to use dtype='float32' (presumably the NumPy dtype np.float32) instead of dtype='float64' to reduce the memory usage of my pandas DataFrame, because I have to handle huge DataFrames.
At some point, I'd like to extract a Python list with .to_dict(orient='records') in order to get one dictionary per row.
With float32, I get additional decimal places in the result, which are probably caused by something like what is described in this question:
Is floating point math broken?
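To illustrate what I suspect is happening, here is a minimal sketch. It assumes the values are stored as np.float32 and get widened to a 64-bit Python float on export; the variable names are only for illustration.

```python
import numpy as np

# 0.1 stored with 32-bit precision
x32 = np.float32(0.1)

# Widening to a 64-bit Python float keeps the float32 rounding error
# and exposes it as extra decimal places.
as_python_float = float(x32)
print(as_python_float)         # 0.10000000149011612
print(as_python_float == 0.1)  # False

# The short repr of the float32 scalar itself still reads as 0.1
print(str(x32))                # 0.1
```

So the stored value does not seem to change; only the number of digits needed to print it exactly as a 64-bit float does.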
How can I cast the data, change the dtype, etc. so that I get the same result as with float64
(see the example snippets)?
import pandas as pd
_data = {'col1': [1.45123, 1.64123], 'col2': [0.1, 0.2]}
_test = pd.DataFrame(_data).astype(dtype='float64')
print(f"{_test=}")
print(f"{_test.round(1)=}")
print(f"{_test.to_dict(orient='records')=}")
print(f"{_test.round(1).to_dict(orient='records')=}")
Output with float64:
_test=      col1  col2
0  1.45123   0.1
1  1.64123   0.2
_test.round(1)=   col1  col2
0   1.5   0.1
1   1.6   0.2
_test.to_dict(orient='records')=[{'col1': 1.45123, 'col2': 0.1}, {'col1': 1.64123, 'col2': 0.2}]
_test.round(1).to_dict(orient='records')=[{'col1': 1.5, 'col2': 0.1}, {'col1': 1.6, 'col2': 0.2}]
import pandas as pd
_data = {'col1': [1.45123, 1.64123], 'col2': [0.1, 0.2]}
_test = pd.DataFrame(_data).astype(dtype='float32')
print(f"{_test=}")
print(f"{_test.round(1)=}")
print(f"{_test.to_dict(orient='records')=}")
print(f"{_test.round(1).to_dict(orient='records')=}")
Output with float32:
_test=      col1  col2
0  1.45123   0.1
1  1.64123   0.2
_test.round(1)=   col1  col2
0   1.5   0.1
1   1.6   0.2
_test.to_dict(orient='records')=[{'col1': 1.4512300491333008, 'col2': 0.10000000149011612}, {'col1': 1.6412299871444702, 'col2': 0.20000000298023224}]
_test.round(1).to_dict(orient='records')=[{'col1': 1.5, 'col2': 0.10000000149011612}, {'col1': 1.600000023841858, 'col2': 0.20000000298023224}]