Replacing characters in each line of a file in Linux

Question

I have a file with a different word on each line. My goal is to change the first character to a capital letter and replace the 3rd character with "#".

For example: football should become Foo#ball.

I thought about using awk and sed, but that didn't help, since (to my knowledge) sed needs an exact character as input and awk can print the desired character but not change it.

It is hard to say what is wrong with your code because you did not provide it or the errors you encountered. Also see [How to create a Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve). — jww, Apr 20 '19 at 21:45
`sed needs an exact character input`? Completely the opposite, sed cannot operate on an exact character input - see https://stackoverflow.com/q/29613304/1745001 for the hoops you have to jump through to make sed act as if it were using literal strings. — Ed Morton, Apr 21 '19 at 03:23

Cyrus · Accepted Answer · 2019-04-20T20:03:19.493

5

With GNU sed and two s commands:

echo 'football' | sed -E 's/(.)/\U\1/; s/(...)./\1#/'

Output:

Foo#ball

See, in the GNU sed manual: 3.3 The "s" Command, 5.7 Back-references and Subexpressions, and 5.9.2 Upper/Lower case conversion
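
To see what each of the two s commands contributes, they can be run separately. This is only a minimal sketch, assuming GNU sed (for \U); the helper names capitalize_first and hash_fourth are made up for illustration and are not part of the answer.

```shell
# Assumes GNU sed; \U is a GNU extension.
# capitalize_first and hash_fourth are hypothetical helper names.

# First s command: capture the first character and upper-case it.
capitalize_first() { sed -E 's/(.)/\U\1/'; }

# Second s command: keep the first three characters, replace the fourth with '#'.
hash_fourth() { sed -E 's/(...)./\1#/'; }

printf '%s\n' football soccer | capitalize_first                # Football, Soccer
printf '%s\n' football soccer | hash_fourth                     # foo#ball, soc#er
printf '%s\n' football soccer | capitalize_first | hash_fourth  # Foo#ball, Soc#er
```

A line shorter than four characters does not match the second pattern, so it is only capitalized.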

edited Apr 20 '19 at 20:03

answered Apr 20 '19 at 19:48

Cyrus


David C. Rankin · Answer 2 · 2019-04-21T03:40:28.183

2

With bash you can use parameter expansions alone to accomplish the task. For example, if you read each line into the variable line, you can do:

line="${line^}"                # change football to Football (capitalize 1st char)
line="${line:0:3}#${line:4}"   # make 4th character '#'

Example Input File

$ cat file
football
soccer
baseball

Example Use/Output

$ while read -r line; do line="${line^}"; echo "${line:0:3}#${line:4}"; done < file
Foo#ball
Soc#er
Bas#ball

While shell is typically slower, when use is limited to builtins, it doesn't fall too far behind.

(note: your question says 3rd character, but your example replaces the 4th character with '#')
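
The offsets in ${line:offset:length} are zero-based, which is where the 3rd/4th difference shows up. A minimal sketch of both readings, assuming bash 4 or later (needed for ${line^}); the function names are hypothetical and only there to make the two variants easy to compare.

```shell
# Requires bash 4 or later (for ${var^}); function names are hypothetical.

# Replace the 4th character, matching the example in the question.
hash_fourth_char() {
    local line=$1
    line="${line^}"                        # capitalize the first character
    printf '%s\n' "${line:0:3}#${line:4}"  # chars 1-3, '#', then from char 5 on
}

# Replace the 3rd character, matching the wording of the question.
hash_third_char() {
    local line=$1
    line="${line^}"
    printf '%s\n' "${line:0:2}#${line:3}"  # chars 1-2, '#', then from char 4 on
}

hash_fourth_char football   # Foo#ball
hash_third_char football    # Fo#tball
```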

edited Apr 21 '19 at 03:40

answered Apr 20 '19 at 20:20

David C. Rankin


The `read` alone will make it extremely slow compared to a sed or awk solution, see https://unix.stackexchange.com/q/169716/133219. Always quote your variables unless you **need** to do something that **requires** them to be unquoted - see https://mywiki.wooledge.org/Quotes. In this case even if we assume there's no spaces in the input you'd get unfortunate results if the input contained globbing chars that just happened to match local file names. – Ed Morton Apr 21 '19 at 03:31

Yes, this only does about 100,000 lines per-second (with SSD), so if your file is millions, you will want something faster. If you are dealing with 1000 lines or less, it's in the noise. Agree on the quoting, that was just omitted due to the one-word input -- but it should be there for generalization. – David C. Rankin Apr 21 '19 at 03:36
On my machine it takes 4.81 secs for 100,000 lines :-). FWIW potong's GNU sed solution (unsurprisingly since this is what sed is best for) took 0.17 secs while my GNU awk one took 0.98 secs and reichharts POSIX awk one only took 0.22 secs. Agreed on the "if you're dealing with 1000 lines or less..." BUT then you'd be looking for some other reason to use it over sed or awk. I'm only commenting because you said `when use is limited to builtins, it doesn't fall too far behind` as it does still fall very far behind. – Ed Morton Apr 21 '19 at 03:50

I used `/usr/lib/dict/words` for the test `305089-words` (`3.481s` total). I'm not saying its faster, you know that. The `"not too far behind"` could better be expressed as `"not near as bad as calling an additional utility inside the loop"`. `awk` solution `(1.833s total)` same file, run in subshell redirected to `/dev/null`. – David C. Rankin Apr 21 '19 at 03:59

score 2 · Answer 3 · answered Apr 20 '19 at 23:57


This might work for you (GNU sed):

sed 's/\(...\)./\u\1#/' file
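
The single s command is enough here because \u upper-cases only the next character of the replacement, while \U would upper-case everything after it. A small sketch of the difference, assuming GNU sed:

```shell
# Assumes GNU sed; \u and \U are GNU extensions.

# \u upper-cases only the next character of the replacement.
echo 'football' | sed 's/\(...\)./\u\1#/'   # Foo#ball

# \U upper-cases everything that follows, up to \E or the end of the replacement.
echo 'football' | sed 's/\(...\)./\U\1#/'   # FOO#ball
```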

answered Apr 20 '19 at 23:57

potong


reichhart · Answer 4 · 2019-04-22T14:16:12.500

1

Cyrus's or potong's answers are the preferred ones (on Linux or other systems with GNU sed, because they rely on \U or \u).

This is just an additional solution with awk, since you mentioned awk and also used the awk tag:

$ echo 'football'|awk '{a=substr($0,1,1);b=substr($0,2,2);c=substr($0,5);print toupper(a)b"#"c}'
Foo#ball

This is a very simple solution without regular expressions. It also works with non-GNU awk.
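
With the one-liner above, a word shorter than four characters still gets a "#" appended (it becomes It#). If that is not wanted, a length check can be added. The following is only a sketch using POSIX awk features; cap_hash is a hypothetical helper name, and leaving short words without "#" is an assumption about the desired behaviour.

```shell
# POSIX awk only; cap_hash is a hypothetical helper name.
cap_hash() {
    awk '{
        # Capitalize the first character, keep the rest unchanged.
        s = toupper(substr($0, 1, 1)) substr($0, 2)
        # Replace the 4th character only when the line has one.
        if (length(s) >= 4)
            s = substr(s, 1, 3) "#" substr(s, 5)
        print s
    }' "$@"
}

printf '%s\n' football it | cap_hash   # Foo#ball, It
```

It reads standard input when called without arguments, or the named files otherwise, e.g. cap_hash file.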

edited Apr 22 '19 at 14:16

answered Apr 20 '19 at 20:18

reichhart


Let me phrase it differently :-) : AFAIK I am not using any "GNU extension". Thus the solution should be most compatible. – reichhart Apr 21 '19 at 07:07

Thanks. First of all I was not sure about the compatibility and second I was not sure about the (english) language. :-) Updated it. – reichhart Apr 22 '19 at 14:17

score 1 · Answer 5 · answered Apr 21 '19 at 03:27


With GNU awk for the 3rd arg to match():

$ echo 'football' | awk 'match($0,/(.)(..).(.*)/,a){$0=toupper(a[1]) a[2] "#" a[3]} 1'
Foo#ball

answered Apr 21 '19 at 03:27

Ed Morton


Thanks. In this case it's kinda overkill though so if this is really all that needs to be done then in reality I'd have to go with [@potongs](https://stackoverflow.com/a/55778740/1745001) if you have GNU tools and if you don't then [@reichharts](https://stackoverflow.com/a/55777394/1745001). – Ed Morton Apr 21 '19 at 04:07

score 0 · Answer 6 · answered Apr 20 '19 at 20:20


This should work with any version of awk:

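awk '{
    for(i=1;i<=NF;i++){
        # Note that string indexes start at 1 in awk!
        # Keep chars 1-2, put "#" in place of char 3, continue from char 4.
        # (For the output Foo#ball from the question's example, use
        # substr($i,2,2) and substr($i,5) instead.)
        $i=toupper(substr($i,1,1)) substr($i,2,1) "#" substr($i,4)
    }
    print
}' file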

Note: if a word is shorter than 3 characters, like "it", it will be printed as "It#".
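
One side effect of looping over fields is that assigning to $i makes awk rebuild the line with single spaces between words. A minimal sketch of the per-word approach, written here to replace the 3rd character as the question's wording asks; per_word is a hypothetical helper name.

```shell
# Works with any POSIX awk; per_word is a hypothetical helper name.
# Capitalizes every whitespace-separated word and replaces its 3rd character.
per_word() {
    awk '{
        for (i = 1; i <= NF; i++)
            $i = toupper(substr($i, 1, 1)) substr($i, 2, 1) "#" substr($i, 4)
        print   # the record was rebuilt, so runs of blanks are now single spaces
    }' "$@"
}

printf 'football   soccer\n' | per_word   # Fo#tball So#cer
```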

answered Apr 20 '19 at 20:20

hek2mgl



"different word in each line": That means that only one word is used per line. Your solution also works on multiple words in one line. – reichhart Apr 20 '19 at 20:24
True. Well, it works also if there is just one word in each line. :) – hek2mgl Apr 20 '19 at 20:26

score 0 · Answer 7 · answered Apr 21 '19 at 05:53


If your data is in a file named 'd' (tested with GNU sed):

sed -E 's/^(\w)(\w\w)\w/\U\1\E\2#/' d
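
Two details of this command are worth seeing in isolation: \E ends the upper-casing started by \U, and \w only matches word characters. A short sketch, assuming GNU sed:

```shell
# Assumes GNU sed (\w, \U and \E are GNU extensions).

# \E stops the upper-casing started by \U, so only group 1 is affected.
echo 'football' | sed -E 's/^(\w)(\w\w)\w/\U\1\E\2#/'   # Foo#ball

# Without \E, group 2 is upper-cased as well.
echo 'football' | sed -E 's/^(\w)(\w\w)\w/\U\1\2#/'     # FOO#ball

# \w matches only letters, digits and underscore, so this line is left unchanged.
echo 'a-b-c' | sed -E 's/^(\w)(\w\w)\w/\U\1\E\2#/'      # a-b-c
```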

answered Apr 21 '19 at 05:53
