
I want to change the value of a specific cell in a CSV file from the bash command line.

I have a sellers.csv:

seller_id,seller_zip_code_prefix,seller_city,seller_state
3442f8959a84dea7ee197c632cb2df15,13023,campinas,SP
723a46b89fd5c3ed78ccdf039e33ac63,93310,novo hamburgo, rio grande do sul, brasil,RS

As you can see, row 3, column 3 (seller_city) violates the one-value-per-field rule because it contains commas. That's why MySQL says "Row 3 contained more data than there were input columns". I want to change `novo hamburgo, rio grande do sul, brasil` into `novo hamburgo rio grande do sul brasil`. I tried awk, but it says I am supplying the wrong arguments:

awk -v r=553 -v c=3 -v val="novo hamburgo - rio grande do sul - brasil" -F sellers.csv

awk: option requires an argument -- F
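For reference, that command fails because -F must be followed by a separator character, not a filename, and no awk program body was supplied. A minimal sketch of the intended row/column edit, using stand-in data rather than the real sellers.csv:

```shell
# -F / FS sets the input separator; OFS keeps commas on output.
# NR==r selects the row, $c=val overwrites the column, the final 1 prints.
printf '%s\n' 'a,b,c,d' 'e,f,g,h' > demo.csv   # stand-in data
awk -v r=2 -v c=3 -v val='NEW' 'BEGIN{FS=OFS=","} NR==r{$c=val} 1' demo.csv
# → a,b,c,d
#   e,f,NEW,h
```

Note the caveat: this only works when the target row splits into the expected number of fields. On row 553 the unquoted internal commas shift the columns, which is exactly the problem described below.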

Cyrus

3 Answers


You don't need to replace the commas, just quote your fields:

$ awk -F',' -v OFS='","' '{city=$0; sub(/([^,]*,){2}/,"",city); sub(/,[^,]*$/,"",city); print "\"" $1, $2, city, $NF "\""}' sellers.csv
"seller_id","seller_zip_code_prefix","seller_city","seller_state"
"3442f8959a84dea7ee197c632cb2df15","13023","campinas","SP"
"723a46b89fd5c3ed78ccdf039e33ac63","93310","novo hamburgo, rio grande do sul, brasil","RS"

But if you really don't want to do that you could do:

$ awk 'BEGIN{FS=OFS=","} {city=$0; sub(/([^,]*,){2}/,"",city); sub(/,[^,]*$/,"",city); gsub(/ *, */," - ",city); print $1, $2, city, $NF}' sellers.csv
seller_id,seller_zip_code_prefix,seller_city,seller_state
3442f8959a84dea7ee197c632cb2df15,13023,campinas,SP
723a46b89fd5c3ed78ccdf039e33ac63,93310,novo hamburgo - rio grande do sul - brasil,RS

For anything else, see What's the most robust way to efficiently parse CSV using awk?.

Ed Morton
  • Thank you for the answer! Initially I removed the quotes. Standard CSV places strings in " quotes precisely to protect internal ',' characters, and those quotes were making MySQL read some integer values in the CSV as strings. To solve that, I stripped the quotes with: awk '{gsub(/\"/,"")};1' input.csv. But that caused another error: row 553 has `novo hamburgo, rio grande do sul, brasil` as its seller_city value, so MySQL gave the error "Row 553 contained more data than there were input columns". Now I need to fix that. I'll try your advice – Abdu Malikov Nov 07 '20 at 17:33

The simplest option is to produce the CSV file with | instead of , as the field separator. That way you avoid this problem entirely.

Standard CSV places strings in " quotes in order to protect internal , characters.

See this question.
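Once no field contains a literal comma (for example, after applying one of the fixes above), switching the delimiter is a plain character translation. A sketch, where the .psv output filename is just a made-up convention:

```shell
# Translate every comma to a pipe. Safe only when the fields themselves
# contain no commas; otherwise the embedded commas get translated too.
tr ',' '|' < sellers.csv > sellers.psv
```

MySQL's LOAD DATA INFILE can then read the result with FIELDS TERMINATED BY '|'.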

Dudi Boy
  • You are absolutely right: CSV places strings in " quotes. Those quotes were making MySQL read some integer values in the CSV as strings, so to solve that I removed the quotes with the bash command: awk '{gsub(/\"/,"")};1' input.csv. But that caused another error: row 553 has `novo hamburgo, rio grande do sul, brasil` as its seller_city value, so MySQL gave the error "Row 553 contained more data than there were input columns". To fix that, I had to look for a way to replace the internal commas. – Abdu Malikov Nov 07 '20 at 17:31

This might work for you (GNU sed):

sed ':a;s/,/&/4;T;s// -/3;ta' file

If the current line contains at least four commas, replace the third comma with ` -` and repeat until at most three remain (three commas means four fields, which is the correct shape).
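Applied to the problem row from the question (GNU sed, since the T branch command is a GNU extension):

```shell
# s/,/&/4 is a no-op substitution used as a test for a 4th comma;
# T exits the loop when it fails; s// -/3 reuses the last regex (a comma)
# and turns the 3rd comma into " -"; ta loops back.
printf '%s\n' '723a46b89fd5c3ed78ccdf039e33ac63,93310,novo hamburgo, rio grande do sul, brasil,RS' |
  sed ':a;s/,/&/4;T;s// -/3;ta'
# → 723a46b89fd5c3ed78ccdf039e33ac63,93310,novo hamburgo - rio grande do sul - brasil,RS
```

Well-formed rows with three commas pass through untouched, so it is safe to run over the whole file.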

potong