0

I have a CSV file composed of several fields split by commas.

id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,

I need to convert the values in the "name" column to uppercase when the "sport" column is shooting or judo. I can only use sed. I am using this command:

sed 's/\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\),\(.*\)/\1,\U\2\E,\3,\4,\5,\6,\7,\shooting|judo,\9,\10,\11,\12/' athletesv2.csv

But it is not working: it just prints "shooting|judo" in every row.

How can I make these replacements?

Note that the solution must be a .sed script file, which has to be run as sed -f script.sed athletes.csv

In the output I need to keep the header.

I am using Ubuntu Linux.

tripleee
Isabel Lopez

3 Answers

1

If you can use GNU sed, you can use:

rx='^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$'
repl='\1,\U\2\E,\3'
sed -E "s/$rx/$repl/" athletes.csv

See the online demo:

#!/bin/bash
rx='^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$'
repl='\1,\U\2\E,\3'

s='id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,'

sed -E "s/$rx/$repl/" <<< "$s"

Output:

id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A JESUS GARCIA,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,

Notes:

  • ^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$ is a pattern that matches the whole line (^ is the start of the line and $ is the end). It captures Field 1 and Field 2 into separate groups and the rest of the line into Group 3. The Field 8 pattern is hard-coded: (shooting|judo) matches either shooting or judo.
  • \U\2\E in the replacement puts the Group 2 value back in uppercase: \U starts the uppercase conversion and \E ends it.
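Since the question asks for a script file, the pattern described in these notes can also be stored in one. The following is only a sketch, assuming GNU sed: the temporary directory and file names are made up for the example, and `-E` still has to be passed on the command line so that the unescaped parentheses and `|` are treated as operators.

```shell
#!/bin/bash
# Sketch only: assumes GNU sed (\U and \E are GNU extensions).
# The temporary directory and its file names are illustrative.
tmpdir=$(mktemp -d)

# script.sed: the same substitution as a one-command script with comments.
cat > "$tmpdir/script.sed" <<'EOF'
# Group 1 = id, group 2 = name, group 3 = everything after the name.
# Group 3 only matches when field 8 (sport) is exactly shooting or judo.
s/^([^,]*),([^,]*),([^,]*,[^,]*,[^,]*,[^,]*,[^,]*,(shooting|judo),[^,]*,[^,]*,[^,]*,[^,]*)$/\1,\U\2\E,\3/
EOF

# Sample rows taken from the demo input.
cat > "$tmpdir/athletes.csv" <<'EOF'
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,
EOF

# -E enables extended regular expressions for the script file.
ere_result=$(sed -E -f "$tmpdir/script.sed" "$tmpdir/athletes.csv")
printf '%s\n' "$ere_result"

rm -r "$tmpdir"
```

The header passes through unchanged because its eighth field is the literal word sport, which the pattern does not match.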

Note that sed only supports backreferences up to \9 (\10 is read as \1 followed by a literal 0), so you need to reduce the number of capturing groups by merging the fields you do not reference individually into a single group.
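A quick way to see this limit is a throwaway substitution with ten groups. This is a minimal sketch, assuming GNU sed; the input string abcdefghij is invented for the demonstration.

```shell
#!/bin/bash
# Ten capturing groups, one per letter. \10 in the replacement is not
# "group 10": sed reads it as \1 (the letter a) followed by a literal 0.
backref_out=$(printf 'abcdefghij\n' | sed -E 's/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)/\10/')
printf '%s\n' "$backref_out"   # prints a0, not j
```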

Wiktor Stribiżew
  • Thank you Wiktor. It works perfectly! However, I need to create a sed script, which has to be run as sed -f script3.sed athletes.csv. In this sed script I need to add more conditions, such as: s/ESP/Spain/p s/DEN/Denmark/p ... – Isabel Lopez Apr 09 '22 at 16:48
  • 2
    Then save the script in a file. Is it acceptable to put `#!/usr/bin/sed -Ef` in the shebang? – tripleee Apr 09 '22 at 17:14
  • 1
    If you aren't allowed to use `sed -E` you will basically have to backslash the parentheses and the `|` like in your original code. – tripleee Apr 09 '22 at 17:16
  • There is no indication I cannot use sed -E, I will try this way. Thanks a lot – Isabel Lopez Apr 09 '22 at 17:46
1

Using sed

$ sed '/^[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,[^,]*,\(shooting\|judo\),/s/,[^,]*/\U&/' input_file
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A JESUS GARCIA,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,
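The same address-plus-substitution idea can be kept in a script file and run with plain sed -f, without -E. The sketch below assumes GNU sed (both \| and \U are GNU extensions) and uses made-up temporary file names. It groups the alternation so that \| applies only to the eighth field, and it shortens the seven leading fields with the \{7\} interval.

```shell
#!/bin/bash
# Sketch only: assumes GNU sed; the temporary files are illustrative.
tmpdir=$(mktemp -d)

cat > "$tmpdir/script.sed" <<'EOF'
# Address: seven comma-terminated fields, then sport = shooting or judo.
# On matching lines, uppercase the first ",field" pair, i.e. the name.
/^\([^,]*,\)\{7\}\(shooting\|judo\),/s/,[^,]*/\U&/
EOF

# Sample rows taken from the question and the first answer's demo.
cat > "$tmpdir/athletes.csv" <<'EOF'
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,
EOF

# Basic regular expressions, so no -E is needed here.
bre_result=$(sed -f "$tmpdir/script.sed" "$tmpdir/athletes.csv")
printf '%s\n' "$bre_result"

rm -r "$tmpdir"
```

Without the \( \) around the alternation, the address would read as "seven fields followed by shooting" or "judo, anywhere in the line", which is looser than intended.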
HatLess
1

This might work for you (GNU sed):

sed -E 'h;x;s/[^,]*/\n&\n/8;/\n(shooting|judo)\n/{x;s/[^,]*/\U&/2;x};x' file

Make a copy of the current line in the hold space.

Surround the eighth field of the working copy with newlines and, if that field is exactly shooting or judo, uppercase the second field in the unadulterated version.
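For use with sed -f, the one-liner can be spread over several lines with a comment per step. This is a sketch under the same GNU sed assumption; the temporary file names are invented for the example, and -E is still required on the command line.

```shell
#!/bin/bash
# Sketch only: assumes GNU sed; the temporary files are illustrative.
tmpdir=$(mktemp -d)

cat > "$tmpdir/script.sed" <<'EOF'
# Copy the line to the hold space; the swap leaves both buffers identical.
h
x
# Wrap the 8th field in newlines so it can be tested in isolation.
s/[^,]*/\n&\n/8
# If the 8th field is exactly shooting or judo...
/\n(shooting|judo)\n/{
  # ...switch to the untouched copy and uppercase its 2nd field,
  x
  s/[^,]*/\U&/2
  # then switch back so the final swap below restores the result.
  x
}
# Bring the untouched (or uppercased) copy back for printing.
x
EOF

# Sample rows taken from the question and the first answer's demo.
cat > "$tmpdir/athletes.csv" <<'EOF'
id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
736041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,athletics,0,0,0,
132041664,A Jesus Garcia,ESP,male,1969-10-17,1.72,64,shooting,0,0,0,
EOF

hold_result=$(sed -E -f "$tmpdir/script.sed" "$tmpdir/athletes.csv")
printf '%s\n' "$hold_result"

rm -r "$tmpdir"
```

Because the newline markers are only ever added to the working copy, the printed line never contains them.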

potong