I have 4 different sed commands which I run on a file. To improve performance, I want to combine these 4 commands into one. Each is a complex command using the -E switch. I searched many forums but could not find an answer for my specific case.

sed -i -E ':a; s/('"$search_str"'X*)[^X&]/\1X/; ta' "$newfile"
sed -i -E '/[<]ExtData[>?" "]/{:a; /Name=/{/Name="'"$nvp_list_ORed"'"/!b}; /Value=/bb; n; ba; :b; s/(Value="X*)[^X"]/\1X/; tb; }' "$newfile"
sed -i -E ':a; s/('"$search_str1"'X*)[^X\<]/\1X/; ta' "$newfile"
sed -i -E ':a; s/('"$search_str2"'X*)[^X\/]/\1X/; ta' "$newfile"

I want to combine them into something like:

sed -i -E 'command1' -e 'command2' -e 'command3' -e 'command4' "$newfile"

But it is not working, perhaps because -E and -e can't be combined.

Please let me know.

Thanks !! Puneet

Mad Physicist
Puneet Jain
  • If you are on Mac OSX or another BSD system, then `-i` requires an argument. An empty argument suffices: `sed -i "" -E -e 'command1' -e 'command2' -e 'command3' -e 'command4' "$newfile"` – John1024 Aug 24 '16 at 18:32
  • Is `-E` supposed to be `-r`? – Mad Physicist Aug 24 '16 at 18:48
  • @MadPhysicist On modern GNU sed (since version 4.2.1), `-r` and `-E` are synonyms. On BSD, only `-E` works. Rumor has it that POSIX is going with `-E` as the standard. – John1024 Aug 24 '16 at 18:52

4 Answers

-E means "extended regex" and is a standalone flag; -e means "expression" and must be followed by a sed expression.
You can combine them, but if you pass several sed expressions, each one must be preceded by -e, which is not the case for your first one.

sed -i -E -e 'command1' -e 'command2' -e 'command3' -e 'command4' "$newfile"

A second option is to write all the commands in a single expression, separated by semicolons:

sed -i -E 'command1;command2;command3;command4' "$newfile"

However, since you're using labels, I wouldn't rely on this option: as John1024 pointed out, some implementations may not support it.
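A more portable way to keep labels on the command line is to end each label definition and each branch at an -e boundary, so that no semicolon ever follows a label name. The following is a minimal sketch under that assumption; the key= prefix and the sample input are made up for illustration and stand in for "$search_str" and the real file.

```shell
# Hypothetical sample data: mask the value after "key=" one character per loop pass.
# ':a' and 'ta' each sit at the end of their own -e chunk, so the label name
# is terminated by the chunk boundary rather than by a ';'.
masked=$(printf '%s\n' 'key=secret&other=1' |
    sed -E -e ':a' -e 's/(key=X*)[^X&]/\1X/' -e 'ta')
printf '%s\n' "$masked"
```

The chunks given with -e are joined with newlines, which is why this form behaves like a script file with one command per line.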

Lastly, as mentioned by Mad Physicist, you can write your sed expressions to a file and pass it with the -f option.
The file must contain one sed expression per line (you can write a multiline expression by ending every line but the last with a \, which escapes the line feed).
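As a rough illustration of the -f approach, the sketch below writes two masking loops to a temporary script file and runs them in one pass. The key= and token= prefixes, the sample input, and the use of mktemp are assumptions for the example, not part of the question.

```shell
# Hypothetical script file created on the fly; in practice it could be a
# permanent file such as mask.sed.
script=$(mktemp)
cat > "$script" <<'EOF'
# One command per line, so every label name ends at a newline.
:a
s/(key=X*)[^X&]/\1X/
ta
:b
s/(token=X*)[^X&]/\1X/
tb
EOF

# Run both loops in a single sed process, then clean up.
result=$(printf '%s\n' 'key=abc&token=xyz' | sed -E -f "$script")
rm -f "$script"
printf '%s\n' "$result"
```

Because the here-document delimiter is quoted, shell variables are not expanded inside the script. To interpolate values such as "$search_str", the file would have to be generated with an unquoted here-document or similar.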

Aaron

Simply pipe them:

sed -E 'A' file | sed -E 'B' | ... >file.tmp && mv file.tmp file
John1024
Alexey Soshin
  • While this will definitely remove the need for the in-between temp-files, it is probably not as optimal as doing everything with one command. +1 anyway. – Mad Physicist Aug 24 '16 at 18:32
  • @John1024. Good call. I think that answer meant `|` instead of `||` based on the word "pipe". And, yes the `-i` flag needs to be removed then. Until then, this answer is nonsense. – Mad Physicist Aug 24 '16 at 18:36
  • Thank you for your comment, fixed the syntax. Anyway, @Aaron's answer is the more performant one. – Alexey Soshin Aug 24 '16 at 18:55
  • @AlexeySoshin Very good. I just added in the code to simulate the `-i` option. Revert if you don't like it. – John1024 Aug 24 '16 at 18:56
  • Piping into sed again would require iterating through the whole file again and again. Given that sed supports multiple commands natively, I would say this is wrong. – hek2mgl Aug 24 '16 at 19:06

As @Aaron observed, if you want to give multiple separate expressions to sed, you must designate them as -e options; they will be combined. You can also combine a bunch of expressions into one by separating the pieces with semicolons.

Your case is a bit special, however: your particular expressions use labels and branch instructions, with one of the label names (a) repeated in each expression. In order to combine these, each label must be distinct, and each branch (whether conditional or unconditional) must specify the correct label. That would look something like this:

sed -i -E \
    -e ':a1; s/('"$search_str"'X*)[^X&]/\1X/; ta1' \
    -e '/[<]ExtData[>?" "]/ {:a2; /Name=/ {/Name="'"$nvp_list_ORed"'"/ !b}; /Value=/ bb2; n; ba2; :b2; s/(Value="X*)[^X"]/\1X/; tb2; }' \
    -e ':a3; s/('"$search_str1"'X*)[^X\<]/\1X/; ta3' \
    -e ':a4; s/('"$search_str2"'X*)[^X\/]/\1X/; ta4' \
    "$newfile"

Do note that even with proper quoting from a shell perspective, which you appear to have, your approach will not do what you expect if the value of any of the interpolated shell variables contains a regex metacharacter.
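One possible way to guard against this is to backslash-escape the metacharacters before interpolating the variable. The sketch below assumes the variable is meant to match literally; escape_ere is a hypothetical helper name, and the sample values are made up. It would not be appropriate as-is for a variable like "$nvp_list_ORed", which is presumably an intentional alternation: there, each name would have to be escaped before the names are joined with |.

```shell
# Hypothetical helper: escape ERE metacharacters and the '/' delimiter so the
# value matches literally inside s/.../.../ under -E. Single-line values only.
escape_ere() {
    printf '%s\n' "$1" | sed 's#[][\/.^$*+?(){}|]#\\&#g'
}

search_str='v1.0='                  # the '.' would otherwise match any character
safe=$(escape_ere "$search_str")    # the dot becomes a backslash followed by a dot

# Only the literal "v1.0=" value is masked; "v1x0=" is left alone.
masked=$(printf '%s\n' 'v1x0=abc&v1.0=abc' |
    sed -E -e ':a' -e 's/('"$safe"'X*)[^X&]/\1X/' -e 'ta')
printf '%s\n' "$masked"
```

Without the escaping step, the unescaped dot would let the pattern match v1x0= as well, and the first value would be masked too.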

John Bollinger

Warning: It is not always possible to combine multiple sed scripts into a single one without change. Sometimes you might have to do a redesign of your algorithm.


Sed has two kinds of memory: the pattern space and the hold space. Combining scripts only works if both spaces hold the same content for the second script as they would in the separate sed commands. Below is an example where the pattern space differs:

$ echo aa | sed -e 's/./&\n/' | sed -e '1s/a/b/g'
b
a
$ echo aa | sed -e 's/./&\n/' -e '1s/a/b/g'
b
b
$ echo aa | gsed -e 's/./&\n/;1s/a/b/g'
b
b

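In the original pipeline, the first sed command works on the pattern space aa and outputs two lines, so the second sed's pattern space for line 1 is only a. In the combined versions, the second command runs on the whole modified pattern space a\na, so both a characters are replaced.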

kvantour