I know it may sound like there are 2000 answers to this question online, but I found none for this specific case (e.g. the -vFPAT approach of this and other answers), because I need to do it with split. I have to split a CSV file with awk in which some values may be inside double quotes. I need to tell the split function to ignore , if it is inside "" in order to get an array of the elements.
Here is what I tried, based on other answers, as an example:
cat try.txt
Hi,I,"am,your",father
maybe,you,knew,it
but,"I,wanted",to,"be,sure"
cat tst.awk
BEGIN {}
{
    n_a = split($0,a,/([^,]*)|("[^"]+")/);
    for (i=1; i<=n_a; i++) {
        collecter[NR][i]=a[i];
    }
}
END {
    for (i=1; i<=length(collecter); i++)
    {
        for (z=1; z<=length(collecter[i]);z++)
        {
            printf "%s\n", collecter[i][z];
        }
    }
}
but no luck:
awk -f tst.awk try.txt
,
,
,
,
,
,
,
,
,
I tried other regular expressions based on other similar answers, but none works for this particular case.
Please note: double-quoted fields may or may not be present, there may be more than one, and they have no fixed position or length!
Thanks in advance for any help!
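
For illustration only, here is a minimal sketch of one possible direction; it is not taken from the answers mentioned above. Since the third argument of split is a separator pattern rather than a field pattern, this sketch swaps in a hypothetical helper, csv_split(), that is called like split(str, arr) and returns the number of elements. The helper name and the file name csv_split.awk are assumptions made for the example. It also assumes well-formed input: no escaped quotes ("") inside a quoted field and no newlines inside fields.

```shell
# Recreate the sample input from the question.
cat > try.txt <<'EOF'
Hi,I,"am,your",father
maybe,you,knew,it
but,"I,wanted",to,"be,sure"
EOF

# csv_split.awk is a hypothetical file name used only for this sketch.
cat > csv_split.awk <<'EOF'
# csv_split (hypothetical helper): fill arr with the fields of str,
# treating commas inside double quotes as ordinary characters.
# Assumes well-formed input (no "" escapes, no embedded newlines).
function csv_split(str, arr,    n) {
    split("", arr)                  # portable way to clear the array
    n = 0
    while (1) {
        # Try a quoted field first, otherwise take everything
        # up to the next comma (possibly nothing).
        if (!match(str, /^"[^"]*"/))
            match(str, /^[^,]*/)
        arr[++n] = substr(str, 1, RLENGTH)
        str = substr(str, RLENGTH + 1)
        if (str == "")
            break
        str = substr(str, 2)        # skip the separating comma
    }
    return n
}
{
    n_a = csv_split($0, a)
    for (i = 1; i <= n_a; i++)
        print a[i]
}
EOF

awk -f csv_split.awk try.txt
```

On the sample file this should print one element per line, with a quoted field such as "am,your" kept together (the surrounding quotes are left in place). The loop in the main rule can be replaced by the collecter[NR][i] assignment from the original script if the two-dimensional array is still needed.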