I have a CSV file like this:

Name,Age,Pos,Country
John,23,GK,Spain
Jack,30,"LM, MC, ST",Brazil
Luke,21,"CMD, CD",England

And I need to get this:

Name,Age,Pos,Country
John,23,GK,Spain
Jack,30,LM,Brazil
Luke,21,CMD,England

With this expression I can extract the field, but I don't know how to update it in the dataset:

grep -o '\(".*"\)' file.csv | cut -d "," -f 1 | sed 's/"//'
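
For reference, this pipeline only prints the extracted values, it does not modify the file:

LM
CMD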
David Molina
  • On SO we do encourage users to show the efforts they have put in to solve their own problems, so please do add the same and let us know (not my down-vote BTW). – RavinderSingh13 May 26 '20 at 09:52
  • You are right, I forgot to show it, sorry. – David Molina May 26 '20 at 10:00
  • Does this answer your question? [What's the most robust way to efficiently parse CSV using awk?](https://stackoverflow.com/questions/45420535/whats-the-most-robust-way-to-efficiently-parse-csv-using-awk) – kvantour May 26 '20 at 12:06

2 Answers

$ sed -E 's/"([^,]+)[^"]*"/\1/' ip.txt
John,23,GK,Spain
Jack,30,LM,Brazil
Luke,21,CMD,England
  • -E to enable ERE
  • " match double quote
  • ([^,]+) match non-comma characters and capture them for reuse in the replacement section
  • [^"]*" match any remaining characters up to and including the closing double quote
  • \1 will refer to the text that was captured with ([^,]+)

Note that this will work only for a single double-quoted field per line and won't handle other valid CSV features such as escaped double quotes, newline characters inside a field, etc.
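
If a line could contain more than one such quoted field, adding the g flag should extend the same idea (still just a sketch, not a general CSV parser):

$ sed -E 's/"([^,]+)[^"]*"/\1/g' ip.txt

For the sample input the output is the same, but this would also reduce a hypothetical line like a,"b, c",d,"e, f" to a,b,d,e.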

Sundeep

Could you please try the following; this should cover the case when you have more than one occurrence of "....." in your Input_file. Written and tested with GNU awk.

awk -v FPAT='[^"]*|"[^"]+"' '
BEGIN{
  OFS=""
}
{
  for(i=1;i<=NF;i++){
    ##Check if the current field is a double-quoted one.
    if($i~/^".*"$/){
      ##Strip the surrounding quotes and everything from the first comma or space onwards.
      gsub(/^"|"$|[, ].*/,"",$i)
    }
  }
}
1
' Input_file
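
With the sample input from the question saved as Input_file, this should print:

Name,Age,Pos,Country
John,23,GK,Spain
Jack,30,LM,Brazil
Luke,21,CMD,England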
RavinderSingh13