0

My sample data is

cat > myfile
"a12","b112122","c12,d12"
a13,887988,c13,d13
a14,b14121,c79,d13

when I try to remove " from colum 2 by

awk -F, 'BEGIN { OFS = FS } $2 ~ /"/ { sub(/"/, "", $2) }1' myfile 
"a12",b112122","c12,d12"
a13,887988,c13,d13
a14,b14121,c79,d13

It only remove only 1 comma, instead of b112122 i am getting b112122"

how to remove all " in 2nd column

Pro Coder
  • 71
  • 6
  • Why about your last question? You haven't accepted it but yet moved on to a different one – Inian Dec 24 '19 at 08:21
  • Please give me sometime... Just sorting out full question. I notice once I click on vote, dosnt matter if I ask further there for same issue no one reply. so didnt vote yet – Pro Coder Dec 24 '19 at 09:10
  • @WDC, Your thread title says `,` comma but letter which you are trying to remove is NOT `,`(comma) it is double quote`"` just letting you know here. – RavinderSingh13 Dec 24 '19 at 09:58
  • @RavinderSingh13 Its not "How to remove all double quotes by using sub() function of awk in a specific field" its . "How to remove double quotes in a specific column by using sub() function of awk in a specific field" – Pro Coder Dec 24 '19 at 10:17
  • @WDC, Sure, please feel free to edit your question, cheers. – RavinderSingh13 Dec 24 '19 at 10:18
  • @WDC, I completely understood meaning of question + I given answer also(which I deleted later). I was trying to fix to make it more clear simple. – RavinderSingh13 Dec 24 '19 at 10:21
  • Since you edit the title to not specify what column to change, you should look at what will work for all column. The solution you have accepted fails if there are `,` within second column. – Jotne Dec 24 '19 at 10:48
  • 1
    But the answer you got will fail in many situation and should not be accepted. Also calling people name here is not needed. We are here to help, and Ravider has been here for years helping out. Mary Xmas.... – Jotne Dec 24 '19 at 10:52

2 Answers2

0

From the documentation:

Search target, which is treated as a string, for the leftmost, longest substring matched by the regular expression regexp.[...] Return the number of substitutions made (zero or one).

It is quite clear that the function sub is doing at most one single replacement and does not replace all occurences.

Instead, use gsub:

Search target for all of the longest, leftmost, nonoverlapping matching substrings it can find and replace them with replacement. The ‘g’ in gsub() stands for “global,” which means replace everywhere.

So you can add a 'g' to your line and it works fine:

awk -F, 'BEGIN { OFS = FS } $2 ~ /"/ { gsub(/"/, "", $2) }1' myfile 
Bentoy13
  • 4,886
  • 1
  • 20
  • 33
  • Try to change to column number 3 and this solution fails. – Jotne Dec 24 '19 at 10:15
  • @Jotne Could you please explain what is failing? It removes the beginning quote mark. If you are referring to the fact that columns 3 and 4 should not be split, it's another matter. I would use this [answer](https://stackoverflow.com/a/43213844/1507014) in this case. – Bentoy13 Dec 24 '19 at 11:20
  • @Jotne I am doing testing in 196 different files, 1st 2 files work for me till now... all 196 files have different structure ... 1st i need to remove comma then double quote and need to remove + . and space in 19th column .. and so on... so far-reaching goal... but I guess if 2-3 people say its not good answer.. please wait so I can update .. i cant be quick.. i am not a pro yet – Pro Coder Dec 24 '19 at 11:56
  • @wdc It works on the second file and gives the result like this `"a12",b112122,"c12,d12"` but if you try column #3, you get `"a12","b112122",c12,d12"`, it should be `"a12","b112122",c12,d12` but then you should also change the field separator to some else than `,`. Se my second example. – Jotne Dec 24 '19 at 18:40
0

When you dealing with CSV file, not using FPAT, it will break sooner or later.
Here is a gnu awk that does the jib.

awk -v OFS="," -v FPAT="([^,]+)|(\"[^\"]+\")" '{gsub(/"/,"",$2)}1' file
"a12",b112122,"c12,d12"
a13,887988,c13,d13
a14,b14121,c79,d13

It will work fine on any column, number 3 as well.

Example on remove " on column 3 at the same time change separator to |

awk -v OFS="|" -v FPAT="([^,]+)|(\"[^\"]+\")" '{gsub(/"/,"",$3);$1=$1}1' file
"a12"|"b112122"|c12,d12
a13|887988|c13|d13
a14|b14121|c79|d13
Jotne
  • 40,548
  • 12
  • 51
  • 55