substituting chemical atomic numbers using sed

Question

I am trying to substitute some patterns of atomic numbers in a single file. That file contains a series of atomic numbers in a column, as shown in the first column below. I want to replace each entry in the first column with the corresponding entry in the second column, line by line.

C1  C21
C2  C22
C4  C23
C5  C24
C6  C25
C7  C26
C8  C27
C9  C28
C10 C29
C11 C30
C12 C31
C13 C32
C14 C33

O1  O11
O2  O12
O3  O13
O4  O14
O5  O15
O6  O16

H1  H31
H2  H32
H3  H33
H4  H34
H5  H35
H6  H36
H7  H37
H8  H38
H9  H39
H10 H40
H11 H41
H12 H42
H13 H43
H14 H44
H15 H45
H16 H46
H17 H47
H18 H48
H19 H49
H20 H50
H21 H51
H22 H52
H23 H53
H24 H54
H25 H55
H26 H56
H27 H57
H28 H58

To achieve this, I tried the sed command below:

 sed -i -e 's/C1/C21/;s/C2/C22/;s/C3/C23/;s/C4/C24/;s/C5/C25/;s/C6/C26/;s/C7/C27/;s/C8/C28/;s/C9/C29/;s/C10/C30/;s/C11/C31/;s/C12/C32/;s/C13/C33/;s/C14/C34/;s/O1/O11/;s/O2/O12/;s/O3/O13/;s/O4/O14/;s/O5/O15/;s/O6/O16/;s/H1/H31/;s/H2/H32/;s/H3/H33/;s/H4/H34/;s/H5/H35/;s/H6/H36/;s/H7/H37/;s/H8/H38/;s/H9/H39/;s/H10/H40/;s/H11/H41/;s/H12/H42/;s/H13/H43/;s/H14/H44/;s/H15/H45/;s/H16/H46/;s/H17/H47/;s/H18/H48/;s/H19/H49/;s/H20/H50/;s/H21/H51/;s/H22/H52/;s/H23/H53/;s/H24/H54/;s/H25/H55/;s/H26/H56/;s/H27/H57/;s/H28/H58/' FILE_NAME

Unfortunately, what I get is multiple substitutions like C3328 and so on.

Can anyone help me find the correct way of doing this? Thanks in advance.

You mean : `awk '{print $2,$2}' file > newfile` (OR) More understandable : `awk '{sub($1,$2,$1); print}'` — sat, Jun 14 '16 at 07:48
@Mark Setchell. I think the posted text is some patterns, not the input file. — waltersu, Jun 14 '16 at 08:08
That's a weird approach you have. If you are okay with manually writing all the substitutions line by line, you might as well rewrite the file manually... See my answer for a generic solution — neric, Jun 14 '16 at 09:14
Possible duplicate of [Replace a field with values specified in another file](http://stackoverflow.com/questions/12400217/replace-a-field-with-values-specified-in-another-file) — tripleee, Jun 14 '16 at 09:37
[edit] your question to show the expected output given your posted sample input as we can't tell from reading a script that doesn't do what you want what it is you DO want. Also, if you're the one downvoting all the posted answers then please stop - it's your fault if they don't do what you want as you haven't told us what you want yet. — Ed Morton, Jun 14 '16 at 12:55
@MarkSetchell I don't mean swap the columns but substitute. For example C1 becomes C21 and C2 becomes C22 and so on. More elaborately I can say that I have the C1, C2, C3 and so on in a file. I want to make change so that the C1 becomes C21 and C2 becomes C22. thanks — Vijay, Jun 15 '16 at 03:09
This definitely looks like a job for Awk, or even Perl or Python. Sed is basically just a text editor. — Dietrich Epp, Jun 15 '16 at 03:19

Ed Morton · Accepted Answer · 2016-06-15T13:35:25.810

0

It's still not clear but I THINK this is what you want:

$ cat tst.awk
BEGIN { cnt["C"]=21; cnt["O"]=11; cnt["H"]=31 }
NF { c=substr($0,1,1); $0=c cnt[c]++ }
{ print }


$ awk -f tst.awk file
C21
C22
C23
C24
C25
C26
C27
C28
C29
C30
C31
C32
C33

O11
O12
O13
O14
O15
O16

H31
H32
H33
H34
H35
H36
H37
H38
H39
H40
H41
H42
H43
H44
H45
H46
H47
H48
H49
H50
H51
H52
H53
H54
H55
H56
H57
H58

edited Jun 15 '16 at 13:35

answered Jun 14 '16 at 12:59

Ed Morton


Michael Vehrs · Answer 2 · 2016-06-14T09:29:43.963

-1

The problem is that sed applies every substitution, in order, to the same line, so the result of an earlier substitution can be matched again by a later one. That is what produces the multiple substitutions. So you need to rearrange your substitutions from most specific to least specific. For example:

$ echo "C1" | sed -n 's/C1/C21/p; s/C2/C22/p; s/C3/C23/p'
C21
C221
$ echo "C1" | sed -n 's/C3/C23/p; s/C2/C22/p; s/C1/C21/p'
C21
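
Note that ordering alone may not be enough once a replacement itself contains a later pattern (with the table in the question, C10 becomes C29, and C29 contains C2). One way to avoid depending on rule order at all is to look each label up once in a table and replace it in a single pass. The following is only a minimal sketch, not the asker's actual setup: the file names `map` and `input` and the sample data are made up for illustration, and it assumes the label is the whole first field of each line.

```shell
# Hypothetical file names and sample data, for illustration only.
tmp=$(mktemp -d)
printf 'C1  C21\nC2  C22\nC10 C29\n' > "$tmp/map"
printf 'C1\nC2\nC10\nC3\n' > "$tmp/input"

# First file: remember old -> new for every non-blank line.
# Second file: replace field 1 only if it is exactly a known label.
result=$(awk 'NR==FNR { if (NF) map[$1] = $2; next }
              ($1 in map) { $1 = map[$1] }
              { print }' "$tmp/map" "$tmp/input")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because each field is compared as a whole and rewritten at most once, C10 is never touched by the C1 rule, and a freshly written C21 is never looked up again. Labels that are not in the table (C3 here) pass through unchanged.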

edited Jun 14 '16 at 09:29

answered Jun 14 '16 at 08:38

Michael Vehrs


pdg · Answer 3 · 2016-06-15T08:34:30.797

-2

Putting [^0-9] after each pattern should work fine. To automate the process:

awk '$0{printf("s/%s\\([^0-9]\\)/%s\\1/g\n", $1, $2)}' <pattern-file >sedscr

Run this one-liner on the pattern file, then cat sedscr, and you should get:

s/C1\([^0-9]\)/C21\1/g
s/C2\([^0-9]\)/C22\1/g
s/C4\([^0-9]\)/C23\1/g
...

After that, run sed with the generated script on your sample files:

sed -f sedscr sample-files...
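
To see the two steps end to end, here is a small self-contained run. It is only a sketch: the temporary file names and the three-line sample are made up for illustration, and the awk one-liner is the same as above apart from reading the file by name. It also shows one limit of the [^0-9] guard: a label at the very end of a line has no following character, so it is left alone.

```shell
# Hypothetical file names and sample data, for illustration only.
tmp=$(mktemp -d)
printf 'C1  C21\nC2  C22\nC10 C29\n' > "$tmp/patterns"
printf 'C1 x\nC10 z\nC1\n' > "$tmp/sample"

# Step 1: turn each "old new" pair into a guarded s/// command.
awk '$0{printf("s/%s\\([^0-9]\\)/%s\\1/g\n", $1, $2)}' "$tmp/patterns" > "$tmp/sedscr"

# Step 2: apply the generated script to the sample.
out=$(sed -f "$tmp/sedscr" "$tmp/sample")

printf '%s\n' "$out"
rm -rf "$tmp"
```

The first two lines come out as "C21 x" and "C29 z": the guard stops C1 from matching inside C10, and stops C2 from matching inside the new C21. The last line stays "C1" because nothing follows the label, so input with labels at end of line would need a different guard.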

edited Jun 15 '16 at 08:34

answered Jun 14 '16 at 08:33

pdg


downvoted as using awk to generate a script to execute with sed is never the right approach. – Ed Morton Jun 14 '16 at 13:02
@EdMorton I don't see what's wrong in it. Surely you could use awk to generate a script to execute with awk, but I don't see any difference here If you truly understand the problem. – pdg Jun 15 '16 at 08:27
@EdMorton you should understand the problem correctly before answering it or judging other answers. – pdg Jun 15 '16 at 08:58
There are some things that are just wrong no matter what problem you are trying to solve. Generating a sed script with awk is one of them. – Ed Morton Jun 15 '16 at 13:19
