Below is my sample data file.

$ cat test.conf
Options -Indexes -FollowSymLinks -Includes -ExecCGI
#Options -Indexes -FollowSymLinks -Includes -ExecCGI
<Directory "/web/htdocs">
    Options -Indexes -FollowSymLinks -Includes -ExecCGI
<Directory "/web/htdocs"> Options -Indexes -FollowSymLinks -Includes -ExecCGI
    RewriteOptions -tester
    Options      -Indexes -FollowSymLinks -Includes -ExecCGI

I want to use a regex to get every entry (single lines only) that begins with the string "Options", optionally preceded by whitespace, followed by any of these strings: "Indexes", "FollowSymLinks", "Includes", "ExecCGI".

I tried the regex below, but it also yields the line "#Options -Indexes -FollowSymLinks -Includes -ExecCGI", which begins with a hash '#', as well as the line "<Directory "/web/htdocs"> Options -Indexes -FollowSymLinks -Includes -ExecCGI", where Options is not at the beginning of the line.

Current Output:

$ grep -E '^[^\n|#]*[^!Rewrite]Options.*|Indexes|FollowSymLinks|Includes|ExecCGI$' test.conf
Options -Indexes -FollowSymLinks -Includes -ExecCGI
#Options -Indexes -FollowSymLinks -Includes -ExecCGI
    Options -Indexes -FollowSymLinks -Includes -ExecCGI
<Directory "/web/htdocs"> Options -Indexes -FollowSymLinks -Includes -ExecCGI
    Options      -Indexes -FollowSymLinks -Includes -ExecCGI

Desired Output:

Options -Indexes -FollowSymLinks -Includes -ExecCGI
    Options -Indexes -FollowSymLinks -Includes -ExecCGI
    Options -Indexes -FollowSymLinks -Includes -ExecCGI

I'm not looking for grep -v as a solution, but for a regex that does the negation itself.

Can you please suggest a regex that meets my requirement?

Wiktor Stribiżew
Ashar
  • Is it for `python` or `grep`? – Wiktor Stribiżew Feb 05 '20 at 08:51
  • Perhaps like this `^[^\S\r\n]*Options[^\S\r\n]*-(?:Indexes|FollowSymLinks|Includes|ExecCGI)\b.*` https://regex101.com/r/ABGR8r/1 – The fourth bird Feb 05 '20 at 08:53
  • `[!Rewrite]` matches a single character which is `!` or `R` or `e` etc. If you were trying to do a negative character class, that's `[^Reirtw]` to match a single character which is not `R` or `e` or `i` or `r` or `t` or `w`. If you were trying to say "not `Rewrite`" there is no simple way in traditional `grep` to do that, though if you have `grep -P` you could use negative lookaheads. – tripleee Feb 05 '20 at 09:42

1 Answer

One option using grep is to match zero or more spaces or tabs at the start of the line with [[:blank:]]*, then match "Options" followed by a group with an alternation | for the option names. Note that \b is a GNU grep extension.

$ grep -E '^[[:blank:]]*Options[[:blank:]]+-(Indexes|FollowSymLinks|Includes|ExecCGI)\b' test.conf

Output

Options -Indexes -FollowSymLinks -Includes -ExecCGI
    Options -Indexes -FollowSymLinks -Includes -ExecCGI
    Options      -Indexes -FollowSymLinks -Includes -ExecCGI

Using Python, you might use

^[^\S\r\n]*Options[^\S\r\n]+-(?:Indexes|FollowSymLinks|Includes|ExecCGI)\b.*$
  • ^ Start of string (or start of each line when re.MULTILINE is set)
  • [^\S\r\n]* Match 0+ times a whitespace char except a newline
  • Options[^\S\r\n]+ Match `Options` followed by 1+ times a whitespace char except a newline
  • - Match literally
  • (?: Non capturing group
    • Indexes|FollowSymLinks|Includes|ExecCGI Match 1 of the options
  • )\b Close group and use a word boundary to prevent the words being part of a larger word
  • .* Match 0+ times any char except a newline
  • $ Assert end of string (or end of each line when re.MULTILINE is set)
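
As a minimal sketch of how the pattern above could be applied in Python, assuming the file content is read into one string and the pattern is compiled with re.MULTILINE so that ^ and $ anchor at each line. The names SAMPLE, PATTERN and matching_lines are made up for this illustration; the sample text is copied from the question's test.conf.

```python
import re

# Sample data copied from the question's test.conf (illustrative only).
SAMPLE = '''Options -Indexes -FollowSymLinks -Includes -ExecCGI
#Options -Indexes -FollowSymLinks -Includes -ExecCGI
<Directory "/web/htdocs">
    Options -Indexes -FollowSymLinks -Includes -ExecCGI
<Directory "/web/htdocs"> Options -Indexes -FollowSymLinks -Includes -ExecCGI
    RewriteOptions -tester
    Options      -Indexes -FollowSymLinks -Includes -ExecCGI
'''

# re.MULTILINE makes ^ and $ match at the start and end of every line,
# not only at the start and end of the whole string.
PATTERN = re.compile(
    r'^[^\S\r\n]*Options[^\S\r\n]+-(?:Indexes|FollowSymLinks|Includes|ExecCGI)\b.*$',
    re.MULTILINE,
)


def matching_lines(text):
    """Return every full line where Options is the first non-blank token."""
    return [m.group(0) for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    for line in matching_lines(SAMPLE):
        print(line)
```

Because only horizontal whitespace is allowed between the start of the line and Options, the commented-out line, the line that starts with <Directory ...>, and the RewriteOptions line are all skipped without a separate exclusion step.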

Regex demo

The fourth bird