0

I have written a small python program which writes excel data to csv, I have a few empty cells which are converting as nan in the cvs.

I have been able to convert nan to zero but my requirement is to proce empty string instead zero for nan.

I have tried to use "replace" but it isn't working.

Here my code to write the data

# Open th workbook
data = pd.read_excel(excelName, sheet_name, header=None).values


tables = dict()
for header, start_row, end_row in eval(config['headers']):
    
    table = data[start_row-1:end_row, 1:7]
    tables[header] = table
  
with open(join(CurrentDirectory, inputFile), 'w') as file:
    for header in tables:
        for row in tables[header]:
            row = [date, header] + row.tolist()
            row = [np.nan_to_num(x, 0) for x in row]
            file.write(",".join([str(x) for x in row]))
            file.write("\n")

so instead of

with open(join(CurrentDirectory, inputFile), 'w') as file:
    for header in tables:
        for row in tables[header]:
            row = [date, header] + row.tolist()
            #row = [np.nan_to_num(x, 0) for x in row]
            row = [np.nan, '' , regex=True]
            file.write(",".join([str(x) for x in row]))
            file.write("\n")

but I am getting an error, could anybody help me with what am I doing wrong?

ERROR: invalid syntax 

here is what my current csv looks like:

A, 2.4, 3212, 0, 343.2234,90
B, 1.4, 0 , 0 , 2839.09, 100

and this is how I want it

A,2.4,3212,,343.2234,90
B,1.4,,,2839.09,100
Ada_lovelace
  • 637
  • 2
  • 6
  • 18

2 Answers2

0

Instead of using np.nan_to_num(x, 0) you could use the following:

row = ['' if np.isnan(x) else x for x in row]

If you want to remove the zeros as well (as indicated by your second example) you can simply expand the condition:

row = ['' if np.isnan(x) or x == 0 else x for x in row]

See np.isnan

aasoo
  • 100
  • 1
  • 8
0

You asked two questions.

  1. How to export to csv with empty string instead of NaN is answered in the comments. Just use pandas. Answered here as well.
  2. Your invalid syntax error comes from this line
row = [np.nan, '' , regex=True]

I'm not sure what you were trying to do but you can't write regex=True to a list.

Tom Wojcik
  • 5,471
  • 4
  • 32
  • 44