Python script to import a comma separated csv that has fixed length fields

Question

I have a .csv file with comma separated fields. I am receiving this file from a 3rd party and the content cannot change. I need to import the file to a database, but there are commas in some of the "comma" separated fields. The comma separated fields are also fixed length - when I stright up print the fields as per the below lines in function insert_line_csv they are spaced in a fixed length.

I need essentially need an efficient method of collecting fields that could have comma's included in the field. I was hoping to combine the two methods. Not sure if that would be efficient.

I am using python 3 - willing to use any libraries to make the job efficient and easy.

Currently I am have the following:

with open(FileName, 'r') as f:
    for count, line in enumerate(f):
        insert_line_csv(count, line)

with the insert_line_csv function looking like:

def insert_line_csv(line_no, line):
    line = line.split(",")
    field0 = line[0]
    field1 = line[1]
    ......

I am importing the line_no, as that is also being entered into the db.

Any insight would be appreciated.

A sample dataset:

text    ,2000.00   ,2018-07-07,textwithoutcomma      ,text     ,1 
text    ,3000.00   ,2018-07-08,textwith,comma        ,text     ,7 
text    ,1000.00   ,2018-07-07,textwithoutcomma      ,text     ,4

from your explanations it looks like it's fixed-width, no comma-separated file — buran, Sep 09 '19 at 10:39
This may help: https://stackoverflow.com/questions/8311900/read-csv-file-with-comma-within-fields-in-python — DarrylG, Sep 09 '19 at 10:40
provided a sample data set. seems like this could be solved by going exclusively in a field width direction. Which will be a pain to realise. — Kevin Smith, Sep 09 '19 at 11:30

Theo Freund · Answer 1 · 2019-09-10T07:03:30.417

If the comma seperated fields are all fixed length, you should be able to just splice them off by count instead of splicing by commas, see Split string by count of characters

as a mockup-code you have

toParse = line

while (toParse != "")
    chunk = first X chars of toParse
    restOfLine = toParse without the chars just cut off
    write chunk to db
    toParse = restOfLine

That should work imho

Edit: upon seeing your sample dataset. Can there only be one field with commas inside of it? If so, you could split via comma, read out the first 3 fields, then the last two. Whatever is left, you concatenate again, because it is the value fo the 4th field. (If it had commas, ou'll need to actually concatenate there, if not, its already the value)

Python script to import a comma separated csv that has fixed length fields

1 Answers1