trim leading and trailing whitespaces along with commas in csv file with python

Question

this is how my data looks like when opened in Microsoft Excel.

As can be seen all the contents of the cell except 218(aligned to the right of the cell) are parsed as strings(aligned to the left of the cell). It is because they start with white space(it is " 4,610" instead of "4610").

I would like to remove all those white spaces at the beginning and also replace those commas(not the ones that make csvs csvs) because if comma exists 4 and 610 may be read into different cells.

Here's what I tried:

this is what i tried with inspiration from this stackoverflow answer:

import csv
import string

with open("old_dirty_file.csv") as bad_file:
    reader = csv.reader(bad_file, delimiter=",")
    with open("new_clean_file.csv", "w", newline="") as clean_file:
        writer = csv.writer(clean_file)
        for rec in reader:
            writer.writerow(map(str.replace(__old=',', __new='').strip, rec))

But, I get this error:

Traceback (most recent call last):
  File "C:/..,,../clean_files.py", line 9, in <module>
    writer.writerow(map(str.replace(__old=',', __new='').strip, rec))
TypeError: descriptor 'replace' of 'str' object needs an argument

How do I clean those files?

You should use the pandas library, its full of functions that help you clean your data — Aven Desta, Feb 18 '21 at 11:12
oh thanks, will you be able to post an example code snippet? — Naveen Reddy Marthala, Feb 18 '21 at 11:19

score 2 · Accepted Answer · answered Feb 18 '21 at 11:55

Just need to separate replacement from stripping because python doesn't know which string the replacement should be made in.

for rec in reader:
    rec = (i.replace(__old=',', __new='') for i in rec)
    writer.writerow(map(str.strip, rec))

or combine them into a single function:

repstr = lambda string, old=',', new='': string.replace(old, new).strip()
for rec in reader:
    writer.writerow(map(repstr, rec))

trim leading and trailing whitespaces along with commas in csv file with python

Here's what I tried:

1 Answers1