I am trying to read a table from an Excel file with Python 3.7 as follows (because of dependencies, I cannot use an alternative to this function):
df = pd.read_excel(ruta_origen, sheet_name=sheet, thousands='.', dtype=object)
The problem is with the numeric columns: although I pass dtype=object, Python still interprets those columns as numeric and performs an internal conversion. This matters because the numeric columns contain values in several formats, for example 10000,25 and 10.000,25. I tried to normalize them with the following code:
df[col] = (
    df[col].astype(str)
    .apply(lambda x: x.replace('.', ''))
    .apply(lambda x: x.replace(',', '.'))
    .astype(float).fillna(0)
)
However, when I write the contents of the table to a CSV file, the result is not what I expected, which is numbers without the point as thousands separator and with the comma as decimal separator (for example, 10000,25).
df.to_csv('file.csv', sep='|', index=False, header=True, decimal=',')
The code above converts the number 46420370,25 to 4642037025.0, and the number 4,197,214.82 to 4197214,82. Can anyone help me?
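For reference, here is a minimal sketch that reproduces the first result without the Excel file. It assumes (this is not confirmed from the workbook) that the cell is stored as a real number and therefore reaches the DataFrame as a float rather than as the text 46420370,25. The sample Series is illustrative, not the real data.

```python
import pandas as pd

# Assumption: the first value arrives as a float (numeric Excel cell),
# while the other two arrive as text in the two formats from the question.
s = pd.Series([46420370.25, "10.000,25", "10000,25"], dtype=object)

cleaned = (
    s.astype(str)                          # the float becomes '46420370.25'
    .apply(lambda x: x.replace('.', ''))   # its decimal point is removed here
    .apply(lambda x: x.replace(',', '.'))  # comma decimals become points
    .astype(float)
    .fillna(0)
)

print(cleaned.tolist())
```

Under that assumption, the text values come out as intended, but the value that was already a float loses its decimal point in the first replace, which would match the 4642037025.0 seen in the output.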