I'm creating a bash script to generate some output from a CSV file (I have over 1000 entries and don't fancy doing it by hand...).
The content of the CSV file looks similar to this:
Australian Capital Territory,AU-ACT,20034,AU,Australia
Piaui,BR-PI,20100,BR,Brazil
"Adygeya, Republic",RU-AD,21250,RU,Russian Federation
I have some code that can separate the fields using the comma as delimiter, but some values actually contain commas, such as "Adygeya, Republic". These values are surrounded by quotes to indicate that the characters within should be treated as part of the field, but I don't know how to parse it to take this into account.
Currently I have this loop:
while IFS=, read -r province provinceCode criteriaId countryCode country
do
    echo "[$province] [$provinceCode] [$criteriaId] [$countryCode] [$country]"
done < "$input"
which produces this output for the sample data given above:
[Australian Capital Territory] [AU-ACT] [20034] [AU] [Australia]
[Piaui] [BR-PI] [20100] [BR] [Brazil]
["Adygeya] [ Republic"] [RU-AD] [21250] [RU,Russian Federation]
As you can see, the third entry is parsed incorrectly. I want it to output:
[Adygeya Republic] [RU-AD] [21250] [RU] [Russian Federation]
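For illustration, here is a minimal sketch of one way the quoting could be handled in pure bash: walk each line character by character and only treat a comma as a delimiter when it is outside double quotes. The function name parse_csv_line and the array name fields are hypothetical, chosen for this example. It assumes every record fits on one line and that a literal quote inside a quoted field is written as two quotes (""); it is not a full CSV parser. The sample rows are fed through a here-document so the snippet runs on its own, and the embedded comma is stripped from the first field only to match the desired output above.

```bash
#!/usr/bin/env bash

# parse_csv_line LINE (hypothetical helper)
# Splits one CSV line into the global array `fields`.
# Commas inside double quotes are kept as part of the field; a doubled
# quote ("") inside a quoted field becomes one literal quote.
# Fields that span several lines are NOT handled.
parse_csv_line() {
    local line=$1 field="" char i in_quotes=0
    fields=()
    for ((i = 0; i < ${#line}; i++)); do
        char=${line:i:1}
        if ((in_quotes)); then
            if [[ $char == '"' ]]; then
                if [[ ${line:i+1:1} == '"' ]]; then
                    field+='"'        # escaped quote inside a quoted field
                    i=$((i + 1))      # skip the second quote of the pair
                else
                    in_quotes=0       # closing quote
                fi
            else
                field+=$char          # commas land here while quoted
            fi
        elif [[ $char == '"' ]]; then
            in_quotes=1               # opening quote
        elif [[ $char == ',' ]]; then
            fields+=("$field")        # unquoted comma ends the field
            field=""
        else
            field+=$char
        fi
    done
    fields+=("$field")                # last field has no trailing comma
}

while IFS= read -r line; do
    parse_csv_line "$line"
    province=${fields[0]//,/}         # drop the embedded comma for display
    echo "[$province] [${fields[1]}] [${fields[2]}] [${fields[3]}] [${fields[4]}]"
done <<'EOF'
Australian Capital Territory,AU-ACT,20034,AU,Australia
Piaui,BR-PI,20100,BR,Brazil
"Adygeya, Republic",RU-AD,21250,RU,Russian Federation
EOF
```

To read from the real file, the here-document could be replaced with `done < "$input"`. A character loop like this is slow in bash, so for much larger files a tool with a real CSV parser may be a better fit.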