How to add a space after a comma if it does not exist within the 6th column in a csv file?

Question

Ubuntu 16.04
Bash 4.3.3

I need a way to add a space after each comma in the 6th column, but only where one does not already exist. I had to comment out the `s/|/,/g; s/,/, /g; s/,\s\+/, /g;` line in my script below because it placed a space after every comma in the csv file.

Wrong: "This is 6th column,Hey guys,Red White & Blue,I know it,Right On"

Perfect: "This is 6th column, Hey guys, Red White & Blue, I know it, Right On"

I could almost see awk printing out the 6th column then having sed do the rest:

awk '{ print $6 }' "$feed " | sed -i 's/|/,/g; s/,/, /g; s/,\s\+/, /g'

This is what I have so far:

for feed in *; do
   sed -r -i 's/([^,]{0,10})[^,]*/\1/5' "$feed"
   sed -i '
      s/<b>//g; s/*//g;
      s/\([0-9]\)""/\1inch/g;
#     s/|/,/g; s/,/, /g; s/,\s\+/, /g;
      s/"one","drive"/"onetext","drive"/;
      s/"comments"/"description"/;
      s/"features"/"optiontext"/;
    ' "$feed"
done

s/|/,/g; s/,/, /g; s/,\s\+/, /g; works, but it applies to the whole line rather than to a single column.

Do you want to add a space after a comma if one doesn't exist? The comma has to exist for a space to be added after it. What does the input file look like? How is it separated? By spaces, so awk can parse it? What is the expected output? Edit: oh, I get it, you want to add a space after a comma if the space does not exist, sorry. – KamilCuk, Sep 09 '18 at 20:18
Why not `awk '{ print $6 }' "$feed " | sed 's/, */, /g'`? What's the `|` doing? And what are all the other substitutions for? – Ljm Dullaart, Sep 09 '18 at 21:25

score 2 · Accepted Answer · Ed Morton · 2018-09-09T23:17:47.430

It sounds like all you need is this (using GNU awk for FPAT):

awk 'BEGIN{FPAT="[^,]*|\"[^\"]+\""; OFS=","} {gsub(/, ?/,", ",$6)} 1'

e.g.:

$ cat file
1,2,3,4,5,"This is 6th column,Hey guys,Red White & Blue,I know it,Right On",7,8

$ awk 'BEGIN{FPAT="[^,]*|\"[^\"]+\""; OFS=","} {gsub(/, ?/,", ",$6)} 1' file
1,2,3,4,5,"This is 6th column, Hey guys, Red White & Blue, I know it, Right On",7,8

It actually looks like your whole shell script including multiple calls to GNU sed could be done far more efficiently in just one call to GNU awk with no need for a surrounding shell loop, e.g. (untested):

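awk -i inplace '
BEGIN{FPAT="[^,]*|\"[^\"]+\""; OFS=","}
{
    $0 = gensub(/([^,]{0,10})[^,]*/,"\\1",5)   # keep at most 10 characters of the 5th field
    $0 = gensub(/([0-9])""/,"\\1inch","g")
    sub(/"one","drive"/,"\"onetext\",\"drive\"")
    sub(/"comments"/,"\"description\"")
    sub(/"features"/,"\"optiontext\"")
    gsub(/, ?/,", ",$6)                        # normalise comma spacing in field 6 only
    print                                      # without a print, inplace would write empty files
}
' *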

edited Sep 09 '18 at 23:17

answered Sep 09 '18 at 23:07


But if the input contains blank fields, `FPAT` will fail, right? – oguz ismail Sep 10 '18 at 05:13
@oguzismail. No. The first part of the `FPAT` allows empty fields. – kvantour Sep 10 '18 at 09:06
@ed, This worked well with what you have. I'm going to implement it tomorrow. – Vituvo Sep 17 '18 at 01:42
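
To illustrate the point about empty fields made in the comments above, here is a minimal sketch, assuming GNU awk is installed as `gawk`. The function name `fix_col6` and the sample record are made up for this example and do not come from the original posts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add a space after each comma in field 6 only.
# Requires GNU awk, because FPAT is a gawk extension.
fix_col6() {
    gawk 'BEGIN { FPAT = "[^,]*|\"[^\"]+\""; OFS = "," }
          { gsub(/, ?/, ", ", $6) } 1' "$@"
}

# Made-up record: fields 2 and 4 are empty, field 6 is quoted and contains commas.
printf '%s\n' '1,,3,,5,"a,b, c",7' | fix_col6
# expected: 1,,3,,5,"a, b, c",7
```

The `[^,]*` alternative can match an empty string, which is why the empty fields still count towards the field number and `$6` remains the quoted field.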

score 0 · Answer 2 · answered Sep 10 '18 at 06:45

This might work for you (GNU sed):

sed -r 's/[^,"]*("[^"]*")*/\n&\n/6;h;s/, ?/, /g;G;s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/' file

Surround the 6th field with newlines. Make a copy of the line in the hold space. In the pattern space, replace every comma, along with an optional following space, with a comma followed by a space. Append the original line, then use pattern matching to splice the amended 6th field into the original line, discarding the rest of the amended copy.
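
The steps described above can be traced on a short record. This is only a sketch, assuming GNU sed; the function name `space_col6` and the sample line are invented for illustration.

```shell
#!/usr/bin/env bash
# Made-up sample record; field 6 is quoted and contains commas.
line='1,2,3,4,5,"a,b, c",7,8'

# Stage 1 only: wrap the 6th field in newlines and show the pattern space
# with `l`, which prints embedded newlines as \n and marks the end with $.
printf '%s\n' "$line" | sed -n -r 's/[^,"]*("[^"]*")*/\n&\n/6;l'
# expected: 1,2,3,4,5,\n"a,b, c"\n,7,8$

# Hypothetical wrapper around the full command from the answer.
space_col6() {
    sed -r 's/[^,"]*("[^"]*")*/\n&\n/6;h;s/, ?/, /g;G;s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/' "$@"
}

printf '%s\n' "$line" | space_col6
# expected: 1,2,3,4,5,"a, b, c",7,8
```

After `G` the pattern space holds the amended copy followed by the original, with five newlines in total. The final substitution keeps the original prefix (`\2`) and the amended field (`\1`), and the original suffix is left in place because the match stops at the last newline.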
