I'm currently working on a script to import data from a .csv file containing financial accounting information. The .csv can contain numbers in either of two formats: "100.000" or "100.000,00". Because the file can be quite large, I'd like to avoid iterating over the rows. That works well for the first format, "100.000", if I import it with a dtype dictionary that sets 'amount': float, but I run into trouble with the second format. Is there a way to specify a number format in the dictionary?

import pandas as pd
datatypes={'Customer Number':int,'Account Number':int,'JE Entry':float}
Stack = 'stackUTF.csv'
delimiter = '|'

try:
    dfStack = pd.read_csv(Stack, sep=delimiter, decimal=',', encoding='utf-8', dtype=datatypes)
except Exception:
    print("format not recognized")

.csv with the dot-only format:

Customer Number|Account Number|JE Entry|

123456|123123123|100.000

.csv with the German format:

Customer Number|Account Number|JE Entry|

123456|123123123|100.000,01

  • Can you share some code as well as sample input csv data – Alderven Feb 21 '19 at 09:53
  • You need to be careful with interpreting "international" number formats. `100.000` will evaluate to `100` because the dot is read as a decimal point, not a thousands separator, and `100.000,00` will cause errors. See https://stackoverflow.com/questions/48843193/convert-a-number-using-atof/48845430#48845430 for ideas how to handle those. Read the number from the CSV file as a string and convert it later. – Tim Pietzcker Feb 21 '19 at 10:20
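The "read as a string and convert later" suggestion can be done without a Python-level loop by using pandas' vectorized string methods. This is only a minimal sketch: the in-memory `sample` is a hypothetical stand-in for the real file, and it assumes every dot in the column is a thousands separator (a value such as "100.5" with a decimal dot would be misread).

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real file (assumption).
sample = (
    "Customer Number|Account Number|JE Entry\n"
    "123456|123123123|100.000,01\n"
    "123457|123123124|100.000\n"
)

# Keep the amount column as text so nothing is misinterpreted on load.
df = pd.read_csv(io.StringIO(sample), sep="|", dtype={"JE Entry": str})

# Vectorized conversion: drop thousands dots, then turn the decimal
# comma into a dot, then cast the whole column to float.
df["JE Entry"] = (
    df["JE Entry"]
    .str.replace(".", "", regex=False)
    .str.replace(",", ".", regex=False)
    .astype(float)
)

print(df["JE Entry"].tolist())
```

Both sample rows end up as floats in the same column, so the two formats from the question can share one code path under the stated assumption.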

1 Answer

One way to convert an amount string in the German format to a float:

num_str = '100.000,01'
num_str = num_str.replace('.', '')
num_str = num_str.replace(',', '.')

number = float(num_str)
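For a whole column, the same conversion can probably be left to the parser, since `pd.read_csv` accepts `thousands` and `decimal` arguments. A minimal sketch, again using a hypothetical in-memory `sample` and assuming the dot is only ever a thousands separator in the file:

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real file (assumption).
sample = (
    "Customer Number|Account Number|JE Entry\n"
    "123456|123123123|100.000,01\n"
    "123457|123123124|100.000\n"
)

# Tell the parser that '.' groups thousands and ',' marks decimals,
# so no per-row Python code is needed.
df = pd.read_csv(
    io.StringIO(sample),
    sep="|",
    thousands=".",
    decimal=",",
)

# Make the amount column float even if every value happened to be whole.
df["JE Entry"] = df["JE Entry"].astype(float)

print(df.dtypes)
```

With these options both "100.000" and "100.000,01" are read as one hundred thousand (plus one cent), which matches the accounting meaning rather than the default dot-as-decimal reading.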

To convert a float back to the German format, do something like this:

num_str = '{:,.2f}'.format(number).replace(',', 'X').replace('.', ',').replace('X', '.')

Thanks for your question!