I am working with a large number of CSV files in which one column contains commas inside the field itself. This column isn't enclosed in quotes, so the files fail to load correctly in external applications.
My CSV files look like this:
col1, col2, col3, co,,,l4, col5, col6
col1, col2, col3, co,,,,,l4, col5, col6
col1, col2, col3, co,,l4, col5, col6
I need to remove all the commas inside this particular column, but I'm not sure how to do it. Rewriting the files with the problematic column properly enclosed in quotes isn't an option.
The problematic commas always fall after the third comma and before the second-last comma on each line, but I don't have enough bash know-how to write a script that removes them.
Input file:
col1, col2, col3, co,,,l4, col5, col6
col1, col2, col3, co,,,,,l4, col5, col6
col1, col2, col3, co,,l4, col5, col6
Expected output:
col1, col2, col3, col4, col5, col6
col1, col2, col3, col4, col5, col6
col1, col2, col3, col4, col5, col6
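To make the rule above concrete, here is a minimal sketch in Python rather than bash. It assumes every line has exactly three clean fields before the problematic column and two clean fields after it. The function name `strip_inner_commas` and its `lead` and `trail` parameters are made up for illustration.

```python
def strip_inner_commas(line: str, lead: int = 3, trail: int = 2) -> str:
    """Remove the commas that fall inside the one unquoted column.

    Assumes `lead` clean fields come before the problematic column and
    `trail` clean fields come after it (hypothetical parameter names).
    """
    parts = line.split(",")
    # Lines with no extra commas (or too few fields) are returned unchanged.
    if len(parts) <= lead + trail + 1:
        return line
    head = parts[:lead]                 # fields before the problematic column
    tail = parts[-trail:]               # fields after the problematic column
    middle = "".join(parts[lead:-trail])  # rejoin the split pieces, no commas
    return ",".join(head + [middle] + tail)


if __name__ == "__main__":
    samples = [
        "col1, col2, col3, co,,,l4, col5, col6",
        "col1, col2, col3, co,,,,,l4, col5, col6",
        "col1, col2, col3, co,,l4, col5, col6",
    ]
    for sample in samples:
        print(strip_inner_commas(sample))
```

The three sample lines are the input shown above. Because the function counts fields from both ends of the line, it does not need to know how many stray commas the middle column contains.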