Pandas read_csv create columns based on maxmimum number of delimiters in a row

Question

Lets say I have a (badly formed) csv like this:

header1, header2, header3
value1, value2, value3, value4

I'd like to load this into a dataframe. However

pd.read_csv(file_data, index = False)

drops value4:

header1	header2	header3
value1	value2	value3

and

pd.read_csv(file_data)

leaves me with no way to differentiate if the value of the index came from value1 in the csv file, or was autoassigned by pandas.

Is there a way to have pandas just create dummy columns on the end based on the row with the maximum number of delimiters?

Does this answer your question? [import csv with different number of columns per row using Pandas](https://stackoverflow.com/questions/27020216/import-csv-with-different-number-of-columns-per-row-using-pandas) — Dušan Maďar, May 06 '21 at 17:49
@Gusti Adli's response below seemed the most elegant, as at least several of the top comments on that thread imply that the user knows the maximum number of columns beforehand. — Maile Cupo, May 06 '21 at 18:59

score 1 · Accepted Answer · answered May 06 '21 at 17:55

If the problem is only the header, setting the header parameter as None will solve your problem.

pd.read_csv(file_data, header=None)

If the number of delimiters on each row is different, you need to read each line using open() function.

with open('test.csv', 'r') as f:
    df = [i.strip().split(',') for i in f.readlines()]
    
df = pd.DataFrame(df)
print(df)

Output: (I added "1,2,3,4,5,6\n" and "11,22,33\n" after the last row)

         0        1        2       3     4     5
0  header1  header2  header3          None  None
1   value1   value2   value3  value4  None  None
2        1        2        3       4     5     6
3       11       12       13    None  None  None

Pandas read_csv create columns based on maxmimum number of delimiters in a row

1 Answers1