
What is the correct syntax for finding a substring (a string which is preceded and followed by specific strings) which does not match a specific pattern?

For example, I want to take all substrings which start with BEGIN_, end with _END and the substring in between is not equal to FOO; and replace the whole substring with the format "(inner substring)". The following would match:

  • BEGIN_bar_END -> (bar)
  • BEGIN_buz_END -> (buz)
  • BEGIN_ihfd8f398IHFf9f39_END -> (ihfd8f398IHFf9f39)

But BEGIN_FOO_END would not match.

I have played around with the following, but cannot seem to find the correct syntax:

sed -e 's/BEGIN_(^FOO)_END/($1)/g'
sed -e 's/BEGIN_([^FOO])_END/($1)/g'
sed -e 's/BEGIN_(?!FOO)_END/($1)/g'
sed -e 's/BEGIN_(!FOO)_END/($1)/g'
sed -e 's/BEGIN_(FOO)!_END/($1)/g'
sed -e 's/BEGIN_!(FOO)_END/($1)/g'
Anthony
  • As a note, when dealing with whole lines, this can be achieved using `!`: http://www.grymoire.com/Unix/Sed.html#uh-32 – Zenexer May 23 '13 at 02:52

4 Answers


There is no general negation operator in sed regexes, IIRC because compiling regexes with negation to DFAs takes exponential time. You can work around this with

sed '/BEGIN_FOO_END/b; s/BEGIN_\(.*\)_END/(\1)/g'

where /BEGIN_FOO_END/b means: if the line contains BEGIN_FOO_END, then branch (jump) to the end of the sed script, so the substitution is skipped for that line.
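
To see the branch workaround in action, here is a minimal sketch. The function name `wrap_unless_foo` is made up for illustration, and the script is split into two `-e` expressions on the assumption that this is more portable than placing `;` directly after `b`.

```shell
# wrap_unless_foo is a hypothetical helper name, not part of the answer.
wrap_unless_foo() {
  # 1st expression: on lines containing BEGIN_FOO_END, branch to the end
  #                 of the script, so the substitution never runs.
  # 2nd expression: wrap the text between BEGIN_ and _END in parentheses.
  sed -e '/BEGIN_FOO_END/b' -e 's/BEGIN_\(.*\)_END/(\1)/g'
}

printf '%s\n' 'BEGIN_bar_END' 'BEGIN_FOO_END' 'x BEGIN_buz_END y' | wrap_unless_foo
```

Note that the test is made per line, not per match: a line that contains BEGIN_FOO_END next to other BEGIN_..._END tokens is left untouched as a whole.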

ckujau
Fred Foo
  • could also be written `sed '/BEGIN_FOO_END/!s/BEGIN_\(.*\)_END/(\1)/g'` – potong Jan 29 '12 at 15:41
  • I'd like to note that `sed '/BEGIN_FOO_END/!s|BEGIN_\(.*\)_END|(\1)|g'` works but `sed '|BEGIN_FOO_END|!s|BEGIN_\(.*\)_END|(\1)|g'` does not! Evidently, it lets you substitute a different separator than "/" in the latter section, but not in the first section. Weird. – CommaToast Sep 05 '14 at 20:56
  • @CommaToast The `s///` command can use an arbitrary delimiter; addresses cannot. – TheDudeAbides Jun 13 '15 at 00:58
  • Address delimiters *can* be arbitrary, but the first delimiter must be escaped, e.g.: `printf '%s\n' a b c | sed '\|a|,\|b|d'` - tested with GNU sed with the `--posix` option. – Peter.O Aug 30 '15 at 07:02
  • Kudos for the DFA link! – Graham Nicholls Oct 08 '18 at 10:28
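
The delimiter point from the comments can be checked with a short sketch. It assumes a sed that accepts the `\cREGEXc` address form (the form Peter.O's comment uses); the function name `wrap_with_pipe_delims` is invented here for illustration.

```shell
# wrap_with_pipe_delims is a hypothetical helper name.
wrap_with_pipe_delims() {
  # Address: the opening delimiter is escaped (\|), the closing one is not.
  # s command: takes | as its delimiter directly, with no escaping needed.
  sed '\|BEGIN_FOO_END|!s|BEGIN_\(.*\)_END|(\1)|g'
}

printf '%s\n' 'BEGIN_bar_END' 'BEGIN_FOO_END' | wrap_with_pipe_delims
```

This is the variant CommaToast reported as failing, with only the leading backslash added to the address.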

This topic may be old, but for the sake of completeness, what about the negation operator `!`?

Make every line that does not start with happy become VERY HAPPY:

echo -e 'happy\nhappy\nunhappy\nhappy' | sed '/^happy/! s/.*/VERY HAPPY/'

Found this here: How to globally replace strings in lines NOT starting with a certain pattern

Community
Httqm
  • I always go to the grymoire after I see something related to sed. I checked yours and there it was, right under my nose – hanzo2001 Jul 20 '18 at 11:18

This might work for you:

sed 'h;s/BEGIN_\(.*\)_END/(\1)/;/^(FOO)$/g' file

This only works if there is only one string per line.
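
As a rough sketch of how the hold-space one-liner above behaves (the function name `restore_if_foo` is hypothetical, chosen only for this example):

```shell
# restore_if_foo is a hypothetical helper name used only for this sketch.
restore_if_foo() {
  # h            copy the untouched line into the hold space
  # s/.../(\1)/  wrap the text between BEGIN_ and _END in parentheses
  # /^(FOO)$/g   if the whole line is now "(FOO)", copy the saved line back
  sed 'h;s/BEGIN_\(.*\)_END/(\1)/;/^(FOO)$/g'
}

printf '%s\n' 'BEGIN_bar_END' 'BEGIN_FOO_END' | restore_if_foo
```

Because the restore test is anchored with `^` and `$`, it only fires when the token is the entire line; a FOO token with other text around it would not be restored.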

For multiple strings per line:

sed 's/BEGIN_\([^F][^_]*\|F\|F[^O][^_]*\|FO\|FO[^O][^_]*\|FOO[^_]\+\)_END/(\1)/g' file

Or the more easily understood:

sed 's/\(BEGIN_\)FOO\(_END\)/\1\n\2/g;s/BEGIN_\([^\n_]*\)_END/(\1)/g;s/\n/FOO/g' file
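
The placeholder approach can be traced on a line holding several tokens. This sketch assumes GNU sed, since it relies on `\n` meaning a newline both in the replacement and inside a bracket expression; the function name `protect_foo` is made up for illustration.

```shell
# protect_foo is a hypothetical helper name; GNU sed is assumed.
protect_foo() {
  # Step 1: swap the FOO in BEGIN_FOO_END for a newline, a character that
  #         cannot otherwise occur inside a single input line.
  # Step 2: wrap every remaining BEGIN_..._END whose inner text contains
  #         neither a newline nor an underscore.
  # Step 3: turn the newline placeholder back into FOO.
  sed 's/\(BEGIN_\)FOO\(_END\)/\1\n\2/g;s/BEGIN_\([^\n_]*\)_END/(\1)/g;s/\n/FOO/g'
}

printf '%s\n' 'BEGIN_FOO_END BEGIN_bar_END BEGIN_buz_END' | protect_foo
```

Unlike the whole-line tests, this leaves BEGIN_FOO_END alone while still rewriting the other tokens on the same line.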
potong

I don't know of a pretty way, but you could always do this:

$ cat file
BEGIN_FOO_END
BEGIN_FrOO_END
BEGIN_rFOO_END
$ sed '/BEGIN_FOO_END/ !{s/BEGIN_\([^_]*\)_END/(\1)/}' file 
BEGIN_FOO_END
(FrOO)
(rFOO)
Mat