Ubuntu 16.04
Bash 4.3.3

I also need a way to add a space after each comma in the 6th column when one is not already there. I had to comment out the line that did this (see the script below) because it placed a space after every comma in the CSV file, not just those in the 6th column.

Wrong: "This is 6th column,Hey guys,Red White & Blue,I know it,Right On"

Perfect: "This is 6th column, Hey guys, Red White & Blue, I know it, Right On"

I could almost see awk printing out the 6th column then having sed do the rest:

awk '{ print $6 }' "$feed " | sed -i 's/|/,/g; s/,/, /g; s/,\s\+/, /g'

This is what I have so far:

for feed in *; do
   sed -r -i 's/([^,]{0,10})[^,]*/\1/5' "$feed"
   sed -i '
      s/<b>//g; s/*//g;
      s/\([0-9]\)""/\1inch/g;
#     s/|/,/g; s/,/, /g; s/,\s\+/, /g;
      s/"one","drive"/"onetext","drive"/;
      s/"comments"/"description"/;
      s/"features"/"optiontext"/;
    ' "$feed"
done

s/|/,/g; s/,/, /g; s/,\s\+/, /g; works but is global and not within a column.

Vituvo
  • You want to add a space after a comma if it doesn't exist? The comma has to exist to add a space after it. What does the input file look like? How is it separated? By spaces, so awk can parse it? What is the expected output? @edit: oh, I get it, you want to add a space after a comma if the space does not exist, sorry. – KamilCuk Sep 09 '18 at 20:18
  • Why not `awk '{ print $6 }' "$feed " | sed 's/, */, /g'`? What's the `|` doing? And all the other substitutes? – Ljm Dullaart Sep 09 '18 at 21:25

2 Answers

It sounds like all you need is this (using GNU awk for FPAT):

awk 'BEGIN{FPAT="[^,]*|\"[^\"]+\""; OFS=","} {gsub(/, ?/,", ",$6)} 1'

e.g.:

$ cat file
1,2,3,4,5,"This is 6th column,Hey guys,Red White & Blue,I know it,Right On",7,8

$ awk 'BEGIN{FPAT="[^,]*|\"[^\"]+\""; OFS=","} {gsub(/, ?/,", ",$6)} 1' file
1,2,3,4,5,"This is 6th column, Hey guys, Red White & Blue, I know it, Right On",7,8
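
The optional space in `, ?` is what stops the substitution from doubling spaces that are already there, which is the problem with `s/,/, /g` on its own. A minimal sketch to check that on a bare string, outside any CSV parsing; `normalize_commas` is a made-up helper name, and it deliberately applies the substitution to the whole line rather than to `$6`, so it should run in any POSIX awk:

```shell
#!/bin/sh
# normalize_commas: hypothetical helper, not part of the command above.
# Reads lines on stdin and rewrites each comma plus at most one following
# space as ", ". Applied to the whole line here, only to show the regex.
normalize_commas() {
    awk '{ gsub(/, ?/, ", ") } 1'
}

printf '%s\n' 'Hey guys,Red White & Blue, I know it' | normalize_commas
```

Feeding the output back through the function leaves it unchanged, so the command is safe to re-run on a file that has already been fixed. Runs of two or more spaces after a comma are not collapsed; `, *` would do that.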

It actually looks like your whole shell script including multiple calls to GNU sed could be done far more efficiently in just one call to GNU awk with no need for a surrounding shell loop, e.g. (untested):

awk -i inplace '
BEGIN{FPAT="[^,]*|\"[^\"]+\""; OFS=","}
{
    $0 = gensub(/([^,]{0,10})[^,]*/,"\\1",5)
    $0 = gensub(/([0-9])""/,"\\1inch","g")
    sub(/"one","drive"/,"\"onetext\",\"drive\"")
    sub(/"comments"/,"\"description\"")
    sub(/"features"/,"\"optiontext\"")
    gsub(/, ?/,", ",$6)
    print    # without this nothing is written back and inplace would empty the files
}
' *
Ed Morton

This might work for you (GNU sed):

sed -r 's/[^,"]*("[^"]*")*/\n&\n/6;h;s/, ?/, /g;G;s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/' file

Surround the 6th field with newlines and copy the line to the hold space. In the pattern space, replace every comma and optional following space with a comma followed by a space. Then append the saved original line and use pattern matching to keep the amended 6th field while restoring everything else from the original, discarding the rest of the amended line.
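
The same command can be spread over several lines with a comment per step, which makes the hold-space trick easier to follow. This is only a sketch: `fix_field6` is a made-up function name, and it assumes GNU sed (for `-r`, for `\n` in the replacement, and for `.` matching an embedded newline):

```shell
#!/bin/sh
# fix_field6: hypothetical wrapper around the one-liner above (GNU sed assumed).
# Reads CSV lines on stdin and writes them to stdout with ", " after every
# comma inside the 6th field only.
fix_field6() {
    sed -r '
        # 1. Mark the 6th field by putting a newline on each side of it.
        s/[^,"]*("[^"]*")*/\n&\n/6
        # 2. Save the marked, otherwise unmodified line in the hold space.
        h
        # 3. Normalise every comma in the pattern space (all fields for now).
        s/, ?/, /g
        # 4. Append the saved original after the modified copy.
        G
        # 5. Keep the original prefix (\2) and the fixed 6th field (\1);
        #    the original suffix is left unmatched, so it stays in place.
        s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/
    '
}

printf '%s\n' 'a,b,c,d,e,"x,y",g,h' | fix_field6
```

Called as `fix_field6 < file`, it leaves the file itself untouched, so the result can be inspected before switching to `sed -i`.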

potong