
I have an XML file in which I am finding and replacing emails and usernames. It mostly works, but to avoid duplicating some user emails I want to skip XML elements of specific types.

I can do this if I want to skip ONE specific type, i.e.

/ApplicationUser/!s/"user.name"/"user.name@abc.com"/g

But not if I try to skip multiple types in the one sed command:

/(OtherElement|ApplicationUser)/!s/"user.name"/"user.name@abc.com"/g

OR

/\(OtherElement\|ApplicationUser\)/!s/"user.name"/"user.name@abc.com"/g

OR

/\(OtherElement|ApplicationUser\)/!s/"user.name"/"user.name@abc.com"/g

I am loading the commands from a file, if that is relevant. I assume it has something to do with the pattern at the start trying to match one or more words, but I'm not sure.

Cyrus
Colin Goudie

3 Answers


The regular expression syntax depends on the version of sed you're using.

According to the POSIX specification, basic regular expressions (BRE) do not support alternation. However, tools do not necessarily stick to the specification, and different versions of sed behave differently here.

The examples below are all processing this file:

$ cat sed-re-test.txt
OtherElement "user.name"
OnlyReplaceMe "user.name"
ApplicationUser "user.name"

GNU sed

The GNU sed BRE variant supports alternation as an extension, but the | metacharacter (along with ( and )) must be escaped with a \. If you use the -E flag to enable Extended Regular Expressions (ERE), the metacharacters must not be escaped.

$ sed --version
sed (GNU sed) 4.4
<...SNIP...>

GNU sed BRE variant (with escaped metacharacters): WORKS

$ cat sed-re-test.txt  | sed '/\(OtherElement\|ApplicationUser\)/!s/"user.name"/"user.name@abc.com"/g'
OtherElement "user.name"
OnlyReplaceMe "user.name@abc.com"
ApplicationUser "user.name"

GNU sed ERE (with unescaped metacharacters): WORKS

$ cat sed-re-test.txt  | sed -E '/(OtherElement|ApplicationUser)/!s/"user.name"/"user.name@abc.com"/g'
OtherElement "user.name"
OnlyReplaceMe "user.name@abc.com"
ApplicationUser "user.name"

BSD/MacOS sed

BSD sed does not support alternation in BRE mode. You must use -E to enable alternation support.

BSD sed has no --version flag, so identifying the OS will have to do:

$ uname -s
OpenBSD

BSD sed BRE (with escaped and unescaped metacharacters): DOES NOT WORK

$ cat sed-re-test.txt  | sed '/\(OtherElement\|ApplicationUser\)/! s/"user.name"/"user.name@abc.com"/'
OtherElement "user.name@abc.com"
OnlyReplaceMe "user.name@abc.com"
ApplicationUser "user.name@abc.com"

$ cat sed-re-test.txt  | sed '/(OtherElement|ApplicationUser)/! s/"user.name"/"user.name@abc.com"/'
OtherElement "user.name@abc.com"
OnlyReplaceMe "user.name@abc.com"
ApplicationUser "user.name@abc.com"

BSD sed ERE (with unescaped metacharacters): WORKS

$ cat sed-re-test.txt  | sed -E '/(OtherElement|ApplicationUser)/! s/"user.name"/"user.name@abc.com"/'
OtherElement "user.name"
OnlyReplaceMe "user.name@abc.com"
ApplicationUser "user.name"
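
Since the question mentions loading the commands from a file, here is a minimal sketch of the ERE form used that way. The file name replace.sed and the temporary directory are hypothetical, and the dot in the pattern is escaped (my tweak, not in the original command) so that it only matches a literal dot. It assumes a sed that accepts -E, which both GNU and BSD sed do.

```shell
# Recreate the sample input used above in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' \
  'OtherElement "user.name"' \
  'OnlyReplaceMe "user.name"' \
  'ApplicationUser "user.name"' > "$workdir/sed-re-test.txt"

# Hypothetical script file: one sed command per line, no shell quoting needed.
cat > "$workdir/replace.sed" <<'EOF'
/(OtherElement|ApplicationUser)/!s/"user\.name"/"user.name@abc.com"/g
EOF

# -E selects ERE, so ( | ) are written unescaped; -f reads the commands from the file.
result=$(sed -E -f "$workdir/replace.sed" "$workdir/sed-re-test.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

Only the OnlyReplaceMe line should come out with the email address appended.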
chuckx
  • BRE do support alternation.. like `()` it needs escaping `\|` – Sundeep May 31 '18 at 07:34
  • oops, I thought `\|` was part of POSIX spec.. anyway, here's a comprehensive comparison: https://stackoverflow.com/questions/24275070/sed-not-giving-me-correct-substitute-operation-for-newline-with-mac-difference/ – Sundeep May 31 '18 at 09:23
  • That's a great comparison. – chuckx May 31 '18 at 20:01

This might work for you (GNU sed):

sed '/OtherElement\|ApplicationUser/b;s/"user.name"/"user.name@abc.com"/g' file

On encountering a line that you do not want to process, the b command (with no label) branches to the end of the script: the line is printed unchanged and sed moves on to the next line.
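
If the \| extension is not available, the same branching idea can be written without alternation by giving each element name its own b command. This is only a sketch on the three-line sample from the other answer; the input and result variable names are mine, and the dot is escaped so it matches only a literal dot.

```shell
# Sample input, held in a variable so the example is self-contained.
input=$(printf '%s\n' \
  'OtherElement "user.name"' \
  'OnlyReplaceMe "user.name"' \
  'ApplicationUser "user.name"')

# Each -e is one command. A bare "b" jumps to the end of the script,
# so lines matching either address are printed untouched.
result=$(printf '%s\n' "$input" |
  sed -e '/OtherElement/b' \
      -e '/ApplicationUser/b' \
      -e 's/"user\.name"/"user.name@abc.com"/g')

printf '%s\n' "$result"
```

This uses only plain BRE addresses, so it should not depend on which sed is installed.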

potong

Just use awk and avoid the convoluted, backwards logic (if X do NOT do Y but do Y for everything else vs the simple if NOT X do Y) and the version-specific constructs that you get with sed.

awk '!/OtherElement|ApplicationUser/{ gsub(/"user.name"/,"\"user.name@abc.com\"") } 1' file

That is clear, simple, extensible and will work with any awk in any shell on any UNIX box.
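
As a quick check, here is that one-liner run against the three-line sample from the first answer. The input and result variable names are just for illustration, and I have escaped the dot in the regexp (a small change from the command above) so it cannot match an arbitrary character.

```shell
# Sample input, held in a variable so the example is self-contained.
input=$(printf '%s\n' \
  'OtherElement "user.name"' \
  'OnlyReplaceMe "user.name"' \
  'ApplicationUser "user.name"')

# Lines NOT matching either element name get the substitution;
# the trailing 1 is an always-true pattern that prints every line.
result=$(printf '%s\n' "$input" |
  awk '!/OtherElement|ApplicationUser/ { gsub(/"user\.name"/, "\"user.name@abc.com\"") } 1')

printf '%s\n' "$result"
```

Only the OnlyReplaceMe line should be changed.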

Ed Morton