Use sed to replace values in a csv column if a condition is met in another column

Question

I have a CSV file composed of several fields split by commas.

id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,

I have to convert the values in the "name" column from lowercase to uppercase when the "sport" column is shooting or judo. I can only use sed. I am using this command:

sed 's/\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\)/\1,\U\2\E,\3,\4,\5,\6,\7,\shooting|judo,\9,\10,\11,\12/' athletesv2.csv

But it is not working, as it's just showing "shooting|judo" in all the rows.

How can I make these replacements?

Note that the output must be a .sed file, which has to be called using sed -f script.sed athletes.csv

In the output I need to keep the header.

I am using Ubuntu Linux.

Standard `sed` does not support case conversion. On which platform and/or `sed` version does this need to work? — tripleee, Apr 09 '22 at 16:57
In Ubuntu, I managed to do the case conversion; what I am missing is how to set the condition in a sed script — Isabel Lopez, Apr 09 '22 at 17:03

score 1 · Answer 1 · answered Apr 09 '22 at 16:41

If you can use GNU sed, you can use

rx='^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$'
repl='\1,\U\2\E,\3'
sed -E "s/$rx/$repl/" athletes.csv

See the online demo:

#!/bin/bash
rx='^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$'
repl='\1,\U\2\E,\3'

s='id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,'

sed -E "s/$rx/$repl/" <<< "$s"

Output:

id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A JESUS GARCIA,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,

Notes:

^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$ is a pattern that matches a whole line (^ is the start of the line and $ is the end). It captures Fields 1 and 2 into separate groups and the rest of the line into Group 3. The Field 8 pattern is hard-coded: (shooting|judo) matches either shooting or judo.
\U\2\E in the replacement puts the Group 2 value back in uppercase.

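Note that sed only supports the backreferences \1 through \9 (\10 is read as \1 followed by a literal 0), so you need to reduce the number of capture groups by merging the fields you do not reference individually into a single group.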
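
To illustrate the backreference limit, here is a minimal sketch, assuming GNU sed; the ten-field input is made up for the demonstration. With ten capture groups in the pattern, \10 in the replacement still expands to Group 1 followed by the character 0, which is why fields 10 to 12 cannot be reached that way.

```shell
# Ten single-character fields, each in its own capture group (made-up input).
out=$(printf 'a,b,c,d,e,f,g,h,i,j\n' |
  sed 's/\(.\),\(.\),\(.\),\(.\),\(.\),\(.\),\(.\),\(.\),\(.\),\(.\)/[\10]/')

# \10 is parsed as \1 followed by a literal 0, not as "group ten".
printf '%s\n' "$out"   # prints [a0]
```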

Thank you Wiktor. It works perfectly! However, I need to create a sed script; it has to be run as sed -f script3.sed athletes.csv. In this sed script I need to add more conditions such as: s/ESP/Spain/p s/DEN/Denmark/p ... — Isabel Lopez, Apr 09 '22 at 16:48
Then save the script in a file. Is it acceptable to put `#!/usr/bin/sed -Ef` in the shebang? — tripleee, Apr 09 '22 at 17:14
If you aren't allowed to use `sed -E` you will basically have to backslash the parentheses and the `|` like in your original code. — tripleee, Apr 09 '22 at 17:16
There is no indication I cannot use sed -E, I will try this way. Thanks a lot — Isabel Lopez, Apr 09 '22 at 17:46
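
Following up on the comments above, here is a minimal sketch of what such a script file could look like, assuming GNU sed (for \U, \E and \|). It is written in basic regex syntax so that it runs with plain sed -f, without -E. The temporary directory and the sample rows (taken from the demo in the answer) are only there to make the example self-contained. The country substitutions are anchored to field 3, an assumption made here so that a name already converted to uppercase (for example one containing "ESP") is not rewritten. The p flag from the comment is left out because, without -n, it would print every matching line twice.

```shell
workdir=$(mktemp -d)

# script.sed: basic regex syntax, so plain `sed -f` is enough (GNU sed assumed).
cat > "$workdir/script.sed" <<'EOF'
# Uppercase field 2 (name) when field 8 (sport) is shooting or judo.
# \([^,]*,\)\{5\} skips fields 3-7; only \1, \2 and \3 are referenced.
s/^\([^,]*\),\([^,]*\),\(\([^,]*,\)\{5\}\(shooting\|judo\),.*\)$/\1,\U\2\E,\3/
# Country codes, matched only when they are the whole of field 3.
s/^\([^,]*,[^,]*\),ESP,/\1,Spain,/
s/^\([^,]*,[^,]*\),DEN,/\1,Denmark,/
EOF

# Sample input: the header plus the two rows used in the demo above.
cat > "$workdir/athletes.csv" <<'EOF'
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,
EOF

result=$(sed -f "$workdir/script.sed" "$workdir/athletes.csv")
printf '%s\n' "$result"
```

The header passes through unchanged because its eighth field is "sport" and its third field is "nationality", so neither command matches it.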

score 1 · Accepted Answer · answered Apr 09 '22 at 17:43

Using sed

$ sed '/^[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,\(shooting\|judo\),/s/,[^,]*/\U&/' input_file
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A JESUS GARCIA,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,
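
One detail worth checking with an address like this: in GNU sed's basic regex syntax, \| has the lowest precedence, so an ungrouped shooting\|judo, splits the whole address into two alternatives, the second being just "judo," anywhere on the line. The sketch below (GNU sed assumed; the sample row is made up) compares the ungrouped and grouped forms on a row whose name happens to end in "judo" but whose sport is athletics.

```shell
# Made-up row: the name ends in "judo", the sport (field 8) is athletics.
row='1,ana judo,ESP,male,2000-01-01,1.80,70,athletics,0,0,0,'

# Ungrouped: the address matches "judo," in the name field, so the name is uppercased.
ungrouped=$(printf '%s\n' "$row" |
  sed '/^[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,shooting\|judo,/s/,[^,]*/\U&/')

# Grouped: the alternation is confined to field 8, so this row is left alone.
grouped=$(printf '%s\n' "$row" |
  sed '/^\([^,]*,\)\{7\}\(shooting\|judo\),/s/,[^,]*/\U&/')

printf '%s\n%s\n' "$ungrouped" "$grouped"
```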

score 1 · Answer 3 · answered Apr 10 '22 at 17:22

This might work for you (GNU sed):

sed -E 'h;x;s/[^,]*/\n&\n/8;/\n(shooting|judo)\n/{x;s/[^,]*/\U&/2;x};x' file

Make a copy of the current line.

Surround the copy of the eighth field by newlines and if that field contains either shooting or judo, uppercase the second field in the unadulterated version.
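
The same idea can be spelled out as a commented script file, which may be easier to follow. This is a sketch assuming GNU sed, run with -E -f; the file names and the temporary directory are made up for the example. It drops the x right after h, since at that point the pattern space and hold space hold the same text.

```shell
workdir=$(mktemp -d)

cat > "$workdir/hold.sed" <<'EOF'
# Keep an untouched copy of the line in the hold space.
h
# In the working copy, wrap the 8th field in newlines.
s/[^,]*/\n&\n/8
# If the wrapped field is exactly shooting or judo ...
/\n(shooting|judo)\n/{
  # ... switch to the untouched copy and uppercase its 2nd field,
  x
  s/[^,]*/\U&/2
  # then put the result back in the hold space.
  x
}
# The hold space now contains the line to print; swap it in.
x
EOF

cat > "$workdir/athletes.csv" <<'EOF'
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,
EOF

held=$(sed -E -f "$workdir/hold.sed" "$workdir/athletes.csv")
printf '%s\n' "$held"
```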
