I have a CSV file in which some fields are enclosed in double quotes. A quoted field may itself contain commas, though this is not always the case. If I extract a column using:

awk -F, '{ print $3 }' testfile.csv

This doesn't work, because awk also splits on the commas inside the quoted fields. So my idea is to change each , within a quoted field to _ and so 'escape' the problem.

These are the contents of my file:

chromosome,position,marker,sample1,sample2
chr1,100,NA,A,C
chr1,200,"test1,test2",A,C
chr1,300,NA,A,C
chr1,400,"test6",A,C
chr1,500,NA,A,C
chr1,600,"test3,test4,test5",A,C

My desired output would be:

chromosome,position,marker,sample1,sample2
chr1,100,NA,A,C
chr1,200,"test1_test2",A,C
chr1,300,NA,A,C
chr1,400,"test6",A,C
chr1,500,NA,A,C
chr1,600,"test3_test4_test5",A,C
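A minimal sketch of this idea, assuming the quoted fields never contain escaped quotes ("") or embedded newlines: split each line on the double quote instead of the comma, so that every even-numbered piece is the inside of a quoted field, and replace the commas only there. The output name fixed.csv is just a placeholder for illustration.

```shell
# Recreate the sample input shown above (this overwrites testfile.csv).
cat > testfile.csv <<'EOF'
chromosome,position,marker,sample1,sample2
chr1,100,NA,A,C
chr1,200,"test1,test2",A,C
chr1,300,NA,A,C
chr1,400,"test6",A,C
chr1,500,NA,A,C
chr1,600,"test3,test4,test5",A,C
EOF

# Use the double quote as both input and output separator.
# Pieces 2, 4, 6, ... then hold the text between a pair of quotes,
# so commas are replaced only in those pieces.
awk 'BEGIN { FS = OFS = "\"" }
     { for (i = 2; i <= NF; i += 2) gsub(/,/, "_", $i); print }' \
    testfile.csv > fixed.csv

# The original comma-based extraction can now be applied to the result.
awk -F, '{ print $3 }' fixed.csv
```

Lines without any quotes consist of a single piece, so the loop never runs and they are printed unchanged. This is not a full CSV parser; input with escaped quotes or multi-line fields would need a real CSV-aware tool.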

Any suggestions?

Inian