sed: Can my pattern contain an "is not" character? How do I say "is not X"?

Question

How do I say "is not" a certain character in sed?

`[^X]` is any char but `X`. PS though we know what you mean, SED should not be capitalised — Sanjay Manohar, Sep 22 '11 at 20:01
@Sanjay Okay I will un-capitalize sed. Can you post an answer in the answer section next time, not the comments? — KRB, Sep 22 '11 at 20:04

score 77 · Accepted Answer · answered Sep 22 '11 at 20:01


[^x]

This is a character class that accepts any character except x.
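
As a quick illustration (a minimal sketch; the sample string and the variable name `masked` are made up for this example), the class can be used in a substitution to touch every character except `x`:

```shell
# Replace every character that is not "x" with "-".
# [^x] matches exactly one character, so with the g flag each
# non-x character is replaced individually.
masked=$(printf 'axbxc\n' | sed 's/[^x]/-/g')
printf '%s\n' "$masked"   # prints: -x-x-
```

Note that the newline at the end of the line is not part of sed's pattern space, so it is never replaced.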


Tom Zych


I'll add a warning from Mastering Regular Expressions. Note that for this to match there must be something there. The regex 'su[^x]' will match 'sum' and 'sun' but not 'su'. – johnny Sep 23 '11 at 07:08
How do you fix it to match the case you mentioned, 'su'? – jcalfee314 Feb 24 '14 at 16:54
@jcalfee314, I believe it should work to add a ? to it like 'su[^x]?' meaning it should match zero or one of those characters. – GradysGhost Mar 09 '16 at 17:10
@GradysGhost No; a trailing repeat is almost never what you want; it means match with or without this final restriction. – tripleee Jul 06 '16 at 06:34
So to spell this out, `su[^x]\?` will match "sux" because the substring "su" satisfies the regular expression. You are excluding the x from the match, not preventing something with an x there from matching. (This could work if there is something after the condition, like `su[^x]\?$`) – tripleee Sep 05 '17 at 07:15
+1 for answering the question as asked, but might be worthwhile to provide an edit suggesting the use of **alternation** to solve the common practical case people like @jcalfee314 are asking about, where "su[!x]" is intended to also match "su" without any characters after it: `\(su[^x]\|su$\)` (I think the shorter form `su\([^x]\|$\)` might also work). – mtraceur Jan 24 '20 at 15:55
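
The comment thread above can be checked with a small experiment. This is only a sketch: it assumes a sed that accepts `-E` (extended regular expressions, as GNU and BSD sed do), so the alternation is written `su([^x]|$)` rather than the backslash-escaped basic-regex form quoted in the comments. The sample input and variable names are invented for the example.

```shell
# Three test lines: "su" followed by a non-x, by nothing, and by x.
input=$(printf 'sum\nsu\nsux\n')

# su[^x] needs a character after "su", so bare "su" is not matched.
plain=$(printf '%s\n' "$input" | sed -n -E '/su[^x]/p')

# su[^x]? makes the class optional, so "sux" matches too
# (the substring "su" alone satisfies the pattern).
optional=$(printf '%s\n' "$input" | sed -n -E '/su[^x]?/p')

# su([^x]|$) accepts a non-x character OR end of line after "su".
alternation=$(printf '%s\n' "$input" | sed -n -E '/su([^x]|$)/p')

printf 'plain:\n%s\noptional:\n%s\nalternation:\n%s\n' \
  "$plain" "$optional" "$alternation"
```

Here `plain` contains only `sum`, `optional` contains all three lines, and `alternation` contains `sum` and `su` but not `sux`.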

score 19 · Answer 2 · edited May 23 '17 at 11:45

For those not satisfied with the selected answer as per johnny's comment.

'su[^x]' will match 'sum' and 'sun' but not 'su'.

You can tell sed to not match lines with x using the syntax below:

sed '/x/! s/su//' file

See kkeller's answer for another example.
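
To see what the `/x/!` address does in practice, here is a minimal sketch with invented sample lines (the variable names are hypothetical). The `!` inverts the address, so the substitution runs only on lines that contain no `x` anywhere:

```shell
# Three sample lines; the second and third contain an "x" somewhere.
input=$(printf 'su one\nx su two\nsux three\n')

# Remove the first "su" only on lines that do NOT contain "x".
result=$(printf '%s\n' "$input" | sed '/x/! s/su//')
printf '%s\n' "$result"
```

Only the first line is changed (to ` one`, with its leading space). The line `x su two` is left alone even though its `su` is not followed by `x`, because the test applies to the whole line rather than to the position after `su`.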

answered Jul 06 '16 at 05:50

Christopher Markieta


This would also skip `x su` even though it's supposed to match. You could do `/sux/!s/su//` but that will still skip `su sux` where perhaps the first, but not the second, `su` should match. Ideally, the OP should clarify the requirements. – tripleee Jul 06 '16 at 09:19
@tripleee The robust solution is `\(su[^x]\|su$\)` (I think the shorter form `su\([^x]\|$\)` might also work). – mtraceur Jan 24 '20 at 16:01

tripleee · Answer 3 · 2016-07-06T09:28:13.617

There are two possible interpretations of your question. Like others have already pointed out, [^x] matches a single character which is not x. But an empty string also isn't x, so perhaps you are looking for [^x]\|^$.

Neither of these answers extends to multi-character sequences, which is usually what people are looking for. You could painstakingly build something like

[^s]\|s\($\|[^t]\|t\($\|[^r]\)\)

to compose a regular expression which doesn't match str, but a much more straightforward solution in sed is to delete any line which does match str, then keep the rest;

sed '/str/d' file
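
A minimal sketch of both directions, using invented sample lines and hypothetical variable names: deleting the lines that match `str`, and, with the `!` address modifier, editing only the lines that do not match.

```shell
input=$(printf 'keep me\nhas str inside\nalso kept\n')

# Delete every line containing "str"; everything else is printed.
kept=$(printf '%s\n' "$input" | sed '/str/d')

# Alternatively, act only on lines that do NOT contain "str":
# here each such line gets a "> " prefix.
marked=$(printf '%s\n' "$input" | sed '/str/!s/^/> /')

printf '%s\n---\n%s\n' "$kept" "$marked"
```

The first command leaves `keep me` and `also kept`; the second prefixes those two lines and passes `has str inside` through unchanged.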

Perl 5 introduced a much richer regex engine, which has since become the de facto standard in Java, PHP, Python, etc. Because Perl helpfully supports a subset of sed syntax, you can often convert a simple sed script to Perl in order to use a feature from this extended regex dialect, such as negative assertions:

perl -pe 's/(?:(?!str).)+/not/' file

will replace a string which is not str with not. The (?:...) is a non-capturing group (unlike in many sed dialects, an unescaped parenthesis is a metacharacter in Perl) and (?!str) is a negative assertion; the text immediately after this position in the string mustn't be str in order for the regex to match. The + repeats this pattern until it fails to match. Notice how the assertion needs to be true at every position in the match, so we match one character at a time with . (newbies often get this wrong, and erroneously only assert at e.g. the beginning of a longer pattern, which could however match str somewhere within, leading to a "leak").
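
A small check of the Perl one-liner, as a sketch only: it assumes `perl` is installed (the guard skips the example otherwise), and the sample input and variable name are made up.

```shell
# Run only if perl is available on this system.
if command -v perl >/dev/null 2>&1; then
  # The match starts at the first position where "str" does not begin
  # and extends one character at a time until "str" would begin.
  # Without the g flag, only that first run is replaced.
  first=$(printf 'a str b\n' | perl -pe 's/(?:(?!str).)+/not/')
  printf '%s\n' "$first"   # prints: notstr b
fi
```

Here the run `a ` (the letter and the following space) is replaced by `not`, and the match stops immediately before `str`.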

+1 Right after the "delete every line which *does* match" `sed` example line, I would also include a "or operate on every line that *does not* match" with a simple `sed` example of how to do that too. (Like `sed '/str/! { ... }'` which I think works, but if I'm wrong then there is always `:`, `b` and `t` to build arbitrary conditional branches with.) — mtraceur, Jan 24 '20 at 16:09

score 2 · Answer 4 · answered May 04 '21 at 14:13

In addition to the answers already provided, you can negate a POSIX character class in sed using the notation [^[:C_CLASS:]]. For example, [^[:blank:]] matches any character that is not a blank (a space or a tab).
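
For instance (a minimal sketch with an invented input string and a hypothetical variable name), replacing every non-blank character shows which characters the negated class covers:

```shell
# Input is "a", space, "b", tab, "c". Every character that is not
# a space or tab is replaced with "#"; the blanks are left untouched.
masked=$(printf 'a b\tc\n' | sed 's/[^[:blank:]]/#/g')
printf '%s\n' "$masked"
```

The result is `#`, a space, `#`, a tab, and `#`.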

mohamad.wael


score 1 · Answer 5 · edited May 23 '17 at 11:53

From my own experience, and the post below supports this, sed doesn't support normal regex negation using "^". I don't think sed has a direct negation method, but if you check the post below, you'll see some workarounds: "Sed regex and substring negation".

answered Oct 17 '13 at 18:04

A Beckler


Sort of.. this works: echo 'blah yeah' | sed 's@href="[^http]@href="/@g' – John Hunt Sep 18 '15 at 08:34
Outputs: blah yeah (only matched the relative url) – John Hunt Sep 18 '15 at 08:35
This is incorrect. `sed` supports `[^...]` negation just fine. Perhaps your understanding of this construct is incomplete, though? The linked question is useful, though. – tripleee Jul 06 '16 at 06:41
But notice that `[^http]` looks for a single character which is not (newline or) h,t, or p. (The second t is completely redundant, but tolerated by the regex engine.) – tripleee Sep 05 '17 at 07:10
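
The point about `[^http]` can be demonstrated with a short sketch. The three `href` lines below are invented for illustration (they are not the input from the comment above), and the variable names are hypothetical:

```shell
input=$(printf 'href="http://a"\nhref="page"\nhref="about"\n')

# [^http] is ONE character that is not h, t, or p. It does not mean
# "not followed by the string http".
result=$(printf '%s\n' "$input" | sed 's@href="[^http]@href="/@')
printf '%s\n' "$result"
```

The `http://a` line is left alone, but so is `href="page"`, because `p` is in the excluded set. On the third line the class matches and consumes the `a`, giving `href="/bout"`. Both effects are probably not what the pattern was meant to do.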
