
I have a text file with a sample record that looks like this...

Tampa,Orlando,"Jacksonville,FL",Miami,"Tallahassee,FL"

I need to replace the embedded commas in fields 3 and 5 with a space " ".

Here is the awk code I have in a bash script...

AWK_script="BEGIN {
   OFS=\",\"
}
{
   for (i=1; i<=NF; i++)
   {
      if ( \$i==3 || \$i==5 )
      {
         gsub(\",\",\" \",\$i)
      }
   }
   print \$0
}
"

echo 'Tampa,Orlando,"Jacksonville,FL",Miami,"Tallahassee,FL"' | awk -vFPAT='([^,]*)|("[^"]+")' "${AWK_script}"

I'm unable to get the gsub to substitute the embedded commas to a space " ". Any help would be greatly appreciated.

  • Eliminate a whole class of errors and resist the temptation to have programming code in env variables. Put code in a file, it will simplify your problem. BUT, excellent [mcve]! Good luck. – shellter Jul 13 '23 at 16:15
  • If you think this is easier to read, here is the command line version of my original code... `$ echo 'Tampa,Orlando,"Jacksonville,FL",Miami,"Tallahassee,FL"' | awk -vFPAT='([^,]*)|("[^"]+")' 'BEGIN { OFS="," } { for (i=1; i<=NF; i++) { if ( $i==3 || $i==5 ) { gsub(","," ",$i) } } print $0 }' Tampa,Orlando,"Jacksonville,FL",Miami,"Tallahassee,FL"` – James P Jul 13 '23 at 16:55
  • `$i` is the **contents** of field number `i`. You want `if (i == 3 || i == 5)` – glenn jackman Jul 13 '23 at 17:23
  • @glennjackman OMG!!! Something so simple. Just needed another set of eyes. That was the problem. Thank you! – James P Jul 13 '23 at 17:25
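
The distinction raised in the comments above, between the field number `i` and the field contents `$i`, can be seen in a minimal sketch. It assumes plain comma splitting with `-F,` instead of FPAT so that it runs in any POSIX awk; the function names and the sample input `3,b,c` are made up for illustration.

```shell
# Hypothetical helpers, not from the original post.

# Tests the CONTENTS of each field: prints the number of every field whose value is 3.
match_by_content() {
    awk -F, '{ for (i = 1; i <= NF; i++) if ($i == 3) print i }'
}

# Tests the field NUMBER: prints i only when the loop reaches field 3.
match_by_number() {
    awk -F, '{ for (i = 1; i <= NF; i++) if (i == 3) print i }'
}

# The first field CONTAINS 3, so the content test fires on field 1.
printf '3,b,c\n' | match_by_content   # prints 1
# The number test fires on the third field regardless of what it holds.
printf '3,b,c\n' | match_by_number    # prints 3
```

In the question's sample record no field has the value 3 or 5, so the `$i==3 || $i==5` test is never true and `gsub` is never called.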

1 Answer


You're making things much harder for yourself by using double quotes around the awk script and storing it in a string. Strings are for storing text; functions are for storing code. As a rule, enclose every string or script in single quotes unless you need double quotes (for example, to let the shell expand a variable), and use double quotes unless you need no quotes at all (for example, to allow word splitting or globbing). So put single quotes around the script and store it in a function instead. Among other things, that lets you get rid of all those backslashes before the double quotes and `$`s inside the awk script.
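
As a rough illustration of that advice, here is a minimal sketch of a single-quoted awk script stored in a shell function. The function name `pickField` and the sample input are hypothetical, and passing the shell value with `-v` is one common way to avoid needing double quotes around the script; it is not something the original question used.

```shell
# Hypothetical helper: print field number $1 of comma-separated input on stdin.
# The awk script is single-quoted, so $n needs no backslash,
# and the shell value is handed to awk with -v instead of being
# interpolated into a double-quoted script string.
pickField() {
    awk -F, -v n="$1" '{ print $n }'
}

printf 'Tampa,Orlando,Miami\n' | pickField 2   # prints Orlando
```

With the script in a double-quoted string, the same body would have to be written as `{ print \$n }` to stop the shell from expanding `$n` before awk ever sees it.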

I think this is what you're trying to do:

$ cat tst.sh
#!/usr/bin/env bash

deComma() {
    awk -v FPAT='([^,]*)|("([^"]|"")*")' -v OFS=',' '
        {
           for (i=3; i<=5; i+=2) {
               gsub(/,/," ",$i)
           }
           print
        }
    ' "${@:--}"
}

echo 'Tampa,Orlando,"Jacksonville,FL",Miami,"Tallahassee,FL"' | deComma

$ ./tst.sh
Tampa,Orlando,"Jacksonville FL",Miami,"Tallahassee FL"

See "What's the most robust way to efficiently parse CSV using awk?" for more information on parsing CSVs with awk, including why I changed your FPAT setting.

Ed Morton