I am seeking a way in bash only for linux & posix environments (no gawk, no python, no libraries or language dependencies beyond default cli tools across linux/posix) method for reading a multi-line csv file into variables one line at a time for processing. The CSV values have commas inside double quotes which is screwing up the existing code:
while IFS=, read -r field1 field2 field3 field4 field5 field6 field7 field8 field9 field10 field11 field12 field13 field14 field15 field18 field17 field18 field19 field29
do
Sample CSV input from a single line:
"AC XA, S.A.","City of Commerce","00","0A348E541E6F5C258A12A5674AEF25F28BA7DCFAECEECC4EE63B71B361606AC3","Included","Included","Included","Not Included","09/30/2003","09/30/2037","","SZ=City of Commerce; FE=http://website.org; O=AC XA SA CIF A39201827; C=QR","--BINARY BLOB--maXJtYSBTQSBDSUYgQTgyNzQzMjg3MSMwIQYDVQQLExpodHRwOi8vd3d3LmNoYW1iZXJzaWduLm9yZzEiMCAGA1UEAxMZQ2hhbWJlcnMgb2YgQ29tbWVyY2UgUm9vdDAeFw0wMzA5MzAUIC0gMjAxNjESMBAGA1UEBRMJQTgyNzQzMjg3MRgwFgYDVQRhDA9WQVRFUy1BODI3NDMyODcxGzAZBgNVBAoMEkFDIENBTUVSRklSTUEgUy5BLjEnMCUGA1UEAwweR0xPQkFMIENIQU1CRVJTSUdOIFJPT1QgLSAyMDE2MB4XDTE2MDQxNDA3NTAwNloXDTQwMDQwODA3NTAwNlowggEIMQswCQYDVQQGEwJFUzEPMA0GA1UECAwGTUFEUklEMQ8wDQYDVQQHDAZNQURSSUQxOjA4BgNVBAsMMXNlZSBjdXJyZW50IGFkZHJlc3MgYXQgd3d3LmNhbWVyZmlybWEuY29tL2FkZHJlc3MxJzAlBgNVBAsMHkdMT0JBTCBDSEFNQkVSU0lHTiBST09UIC0gMjAxNjESMBAGA1UEBRMJQTgyNzQzMjg3MRgwFgYDVQRhDA9WQVRFUy1BODI3NDMyODcxGzAZBgNVBAoMEkFDIENBTUVSRklSTUEgUy5BLjEnMCUGA1UEAwweR0xPQkFMIENIQU1CRVJTSUdOIFJPT1QgLSAyMDE2MIICIjANBgkqhkiG9w0BAQEF--END--","SZ1qSRW","Email","AFX Client;Email","AFX Method;AFX Method;Entry Point;AFX Email;Time Stamping","No Entry","",""
The desired output would be reading each of the following value strings per line into variables:
"AC XA, S.A."
"City of Commerce"
"00"
"0A348E541E6F5C258A12A5674AEF25F28BA7DCFAECEECC4EE63B71B361606AC3"
"Included"
"Included"
"Included"
"Not Included"
"09/30/2003"
"09/30/2037"
""
"SZ=City of Commerce; FE=http://website.org; O=AC XA SA CIF A39201827; C=QR"
"--BINARY BLOB--maXJtYSBTQSBDSUYgQTgyNzQzMjg3MSMwIQYDVQQLExpodHRwOi8vd3d3LmNoYW1iZXJzaWduLm9yZzEiMCAGA1UEAxMZQ2hhbWJlcnMgb2YgQ29tbWVyY2UgUm9vdDAeFw0wMzA5MzAUIC0gMjAxNjESMBAGA1UEBRMJQTgyNzQzMjg3MRgwFgYDVQRhDA9WQVRFUy1BODI3NDMyODcxGzAZBgNVBAoMEkFDIENBTUVSRklSTUEgUy5BLjEnMCUGA1UEAwweR0xPQkFMIENIQU1CRVJTSUdOIFJPT1QgLSAyMDE2MB4XDTE2MDQxNDA3NTAwNloXDTQwMDQwODA3NTAwNlowggEIMQswCQYDVQQGEwJFUzEPMA0GA1UECAwGTUFEUklEMQ8wDQYDVQQHDAZNQURSSUQxOjA4BgNVBAsMMXNlZSBjdXJyZW50IGFkZHJlc3MgYXQgd3d3LmNhbWVyZmlybWEuY29tL2FkZHJlc3MxJzAlBgNVBAsMHkdMT0JBTCBDSEFNQkVSU0lHTiBST09UIC0gMjAxNjESMBAGA1UEBRMJQTgyNzQzMjg3MRgwFgYDVQRhDA9WQVRFUy1BODI3NDMyODcxGzAZBgNVBAoMEkFDIENBTUVSRklSTUEgUy5BLjEnMCUGA1UEAwweR0xPQkFMIENIQU1CRVJTSUdOIFJPT1QgLSAyMDE2MIICIjANBgkqhkiG9w0BAQEF--END--"
"SZ1qSRW"
"Email"
"AFX Client;Email"
"AFX Method;AFX Method;Entry Point;AFX Email;Time Stamping"
"No Entry"
""
""
Currently the IFS method I have breaks because of the comma within the first value "AC XA, S.A.". I've tried the following awk with some success but it isn't clear if this can be used in such a way to read each of the items into variables and specify the line number:
awk -F"," '{for(i=1;i<=NF;i++){if(i%NF==0){ORS="\n"} {if($i ~ /^\"/ || $i ~ /\"$/) {a=a OFS $i;j++;{if(j%2==0){sub(/^\,/,X,a); print a;j=0;a=""}}} else {print $i}}} {ORS=","}}' ORS="," OFS=, file.csv