I am trying to build a "super non-greedy" regex, for lack of a better phrase. The problem I'm having is distilled as follows:

https://regex101.com/r/wuwOGd/2

Regex: \/\*\*(.*?)\*\/\w+\d+

Sample String: /**word1*/asdf /**word2*/abc123

What I want it to do: Only match the second token so I can extract word2.

What it's doing: Matching word1*/asdf /**word2 which is technically correct, so I can't blame the regex for doing what I told it to do. But is there a way I can have the regex "fail" as soon as it has to expand beyond the first */?
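For reference, the behaviour can be reproduced in Python with the built-in re module. This is only a minimal sketch of the problem as described above; the variable names are mine.

```python
import re

# The original pattern: lazy .*? between the comment delimiters.
pattern = re.compile(r"/\*\*(.*?)\*/\w+\d+")
sample = "/**word1*/asdf /**word2*/abc123"

match = pattern.search(sample)

# The lazy group still expands past the first "*/", because that is
# the only way the trailing \w+\d+ can succeed from position 0.
print(match.group(1))  # word1*/asdf /**word2
```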

I'm using Python in this particular case to match comment blocks attached to functions with certain signatures.

Edit:

As pointed out below, it turns out the magic word I was searching for was "tempered", not "super"!

Gillespie

2 Answers

You can use a negated character class instead of non-greedy repetition:

\/\*\*([^*]*)\*\/\w+\d+

https://regex101.com/r/wuwOGd/3

Since the token you are looking for is delimited by *, this is quite safe.
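A quick Python sketch of the negated-class approach, assuming the built-in re module. The second input is there to show where the approach stops working: a comment body that itself contains a * can no longer be matched.

```python
import re

# [^*]* can never cross a "*", so it cannot run past the first "*/".
pattern = re.compile(r"/\*\*([^*]*)\*/\w+\d+")

simple = pattern.findall("/**word1*/asdf /**word2*/abc123")
print(simple)  # ['word2']

# Limitation: a "*" inside the comment body stops the character class early.
with_star = pattern.findall("/**word2 * super cool*/abc123")
print(with_star)  # []
```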

mrzasa
  • If the captured value is `wo*rd2`, this may not work, though maybe `*` is not allowed there. – anubhava Feb 08 '18 at 16:18
  • Sorry, I wasn't clear. The delimiter must be `*/` since it is a comment block. So the following string breaks it: `/**word2 * super cool*/abc123` (it should extract `word2 * super cool`). – Gillespie Feb 08 '18 at 16:19

See regex in use here

/\*{2}((?:(?!\*/).)*)\*/\w+\d+

Alternatively, without having to capture it (assuming PCRE). See regex in use here

/\*{2}\K(?:(?!\*/).)*(?=\*/\w+\d+)

This method uses a tempered greedy token: the negative lookahead (?!\*/) is checked before each character is consumed, so the repetition matches any character but can never run past a */.
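Since the question mentions Python, here is a minimal sketch of the capturing form with the built-in re module. The \K variant is not used here because re does not support \K. The test strings are the ones from this thread; the variable names are mine.

```python
import re

# (?:(?!\*/).)* consumes one character at a time, but only while the
# next two characters are not the closing "*/".
pattern = re.compile(r"/\*{2}((?:(?!\*/).)*)\*/\w+\d+")

simple = pattern.findall("/**word1*/asdf /**word2*/abc123")
print(simple)  # ['word2']

# A lone "*" inside the comment body is fine; only "*/" ends the token.
with_star = pattern.findall("/**word2 * super cool*/abc123")
print(with_star)  # ['word2 * super cool']
```

If the comment blocks can span several lines, passing re.DOTALL to re.compile lets . match newlines as well.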

ctwheels
  • Yes! This is exactly what I was looking for. I've never used negative lookahead or tempered greedy token, so this answer gives me the right phrases to google and learn more. Thanks! – Gillespie Feb 08 '18 at 16:23