Search for a pattern in a column of a CSV file and replace another pattern on the same line using sed

Question

I want to check whether the second column of a CSV file starts with a given pattern and, if it does, replace something else on the same line.

I wrote the following sed command to change I to N when the pattern 676 appears in the second column of the CSV below. However, it also matches 676 in later columns (for example the 7th), because ,676 occurs there too. Ideally, only the second column should be checked for the prefix 676. The match must be at the start of the second value, not in the middle or at the end (for example, 46769777 should not match), and only then should ,I, be changed to ,N,.

sed -i  '/,676/ {; s/,I,/,N,/;}' temp.csc

6768880,55999777,S,I,TTTT,I,67677,yy
6768880,676999777,S,I,TTTT,I,67677,yy 
6768880,46769777,S,I,TTTT,I,67677,yy
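To make the over-matching concrete, here is a small reproduction. It is only a sketch: it feeds the three sample rows through stdin rather than editing temp.csc with -i, and the variable names `sample` and `unanchored` are made up for the illustration.

```shell
# Sample rows from the question (trailing spaces dropped).
sample='6768880,55999777,S,I,TTTT,I,67677,yy
6768880,676999777,S,I,TTTT,I,67677,yy
6768880,46769777,S,I,TTTT,I,67677,yy'

# Same address and substitution as the command above, minus -i.
# Every row contains ",676" (in the 7th column), so every row is selected,
# and without the g flag only the first ",I," on each row is changed.
unanchored=$(printf '%s\n' "$sample" | sed '/,676/ s/,I,/,N,/')
printf '%s\n' "$unanchored"
```

All three rows come back with their first ,I, turned into ,N,, which is the unwanted behaviour described above.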

Expected result required

6768880,55999777,S,I,TTTT,I,67677,yy
6768880,676999777,S,N,TTTT,N,67677,yy
6768880,40999777,S,I,TTTT,I,67677,yy

It is difficult to write supportable code in `sed` that can do this. Do you really care whether it is `sed`? `awk` is designed with this sort of problem in mind and would make this very easy to implement. Good luck. – shellter, Apr 09 '15 at 02:29
You have an error in your expected output. How does `46769777` become `40999777`? – Jotne, Apr 09 '15 at 05:22

John1024 · Answer 1 · 2015-04-09T04:03:18.260

This requires that 676 appear at the beginning of the second column before any changes are made:

$ sed   '/^[^,]*,676/ s/,I,/,N,/g' file
6768880,55999777,S,I,TTTT,I,67677,yy
6768880,676999777,S,N,TTTT,N,67677,yy 
6768880,46769777,S,I,TTTT,I,67677,yy

Notes:

The regex /^[^,]*,676/ requires that 676 appear immediately after the first comma on the line, that is, at the start of the second column. In more detail:
- ^ matches the beginning of the line
- [^,]* matches the first column (any run of characters other than a comma)
- ,676 matches the first comma followed by 676

In your desired output, ,I, was replaced with ,N, every time it appeared on the line. To accomplish this, g (meaning global) was added to the substitute command.
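The question used -i to edit the file in place. As a rough sketch of one way to get the same effect without relying on how a particular sed implements -i, the anchored command can write to a second file that is then renamed. The temporary file from `mktemp` and the variable names `tmp` and `anchored` are assumptions for this example, not part of the question.

```shell
# Work on a throwaway copy so the example is safe to run anywhere.
tmp=$(mktemp)
printf '%s\n' \
  '6768880,55999777,S,I,TTTT,I,67677,yy' \
  '6768880,676999777,S,I,TTTT,I,67677,yy' \
  '6768880,46769777,S,I,TTTT,I,67677,yy' > "$tmp"

# Anchored address: skip column 1 with [^,]*, then require 676 right after
# the first comma. The result goes to a second file, which replaces the
# original only if sed succeeded.
sed '/^[^,]*,676/ s/,I,/,N,/g' "$tmp" > "$tmp.new" && mv "$tmp.new" "$tmp"

anchored=$(cat "$tmp")
printf '%s\n' "$anchored"
rm -f "$tmp"
```

Only the middle row is rewritten; the first and third rows still contain 676 in other positions but are left alone.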

score 2 · Answer 2 · answered Apr 09 '15 at 03:47

If you are not bound to sed, awk might be a better option for you. Give this a try:

awk -F"," '{match($2,/^676/)&&gsub(",I",",N")}{print}' temp.csc

match tests whether the second column starts with 676 (the ^ anchors the match to the start of the field). gsub then replaces every ,I on the line with ,N.

Result:

6768880,55999777,S,I,TTTT,I,67677,yy
6768880,676999777,S,N,TTTT,N,67677,yy
6768880,46769777,S,I,TTTT,I,67677,yy
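One thing to be aware of is that a pattern of ,I also matches the start of longer values such as ,ITEM. If that matters for your data, a field-by-field variant is one possible alternative. This is only a sketch, not the command from this answer: the fourth input row is a hypothetical one added to show the difference, and the variable names `sample` and `fieldwise` are made up.

```shell
# Three rows from the question plus one hypothetical row with an ITEM field.
sample='6768880,55999777,S,I,TTTT,I,67677,yy
6768880,676999777,S,I,TTTT,I,67677,yy
6768880,46769777,S,I,TTTT,I,67677,yy
1,676123,ITEM,I'

fieldwise=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = OFS = "," }
$2 ~ /^676/ {
    # Change only fields whose whole value is I, starting at column 3.
    for (i = 3; i <= NF; i++) if ($i == "I") $i = "N"
}
{ print }')
printf '%s\n' "$fieldwise"
```

Because each field is compared as a whole, ITEM is left untouched. Unlike a ,I, pattern, this also changes an I in the last column, which may or may not be what you want.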

iamauser

Could be shortened somewhat to: `awk -F, '{match($2,/^676/)&&gsub(",I",",N")}1'` – Jotne Apr 09 '15 at 05:23
Can be further shortened: `awk -F, '$2~/^676/{gsub(",I",",N")}1'` – glenn jackman Apr 09 '15 at 10:29
