
I use an older language that doesn't have any prebuilt syntax highlighters.

Although Notepad++ has a User Defined Language feature, there are some cases it doesn't support. For these cases I use a Python script that applies syntax highlighting using regex.

The issue I have run into is using regex to find patterns within curly braces. The pattern I'm trying to match is [A-Za-z_]\w* (basically, a variable name). However, I would like to match only instances that occur within double curly braces.

In the following string I would like to match both instances of TimeStamps and Descending and nothing else. Test Test2'{{TimeStamps(Descending(1))(7:8)}}/{{TimeStamps(Descending(1))(1:4)}} - '

I have tried variations of (?<={{)([A-Za-z_]\w*)*(?=[0-9\(\)\:]*}}), but it feels like I'm overcomplicating it.

Any help is appreciated. Thanks in advance.

Axe319
  • Is the pattern for python or for notepad? You might use capturing groups `{{(\w+)[^\(]*\(([^()]+).*?}}` https://regex101.com/r/PCvoDi/1 – The fourth bird Feb 06 '20 at 14:05
  • Use `{{.*?}}` and then get each match inside the previous matches with your `[A-Za-z_]\w*` – Wiktor Stribiżew Feb 06 '20 at 14:06
  • Unfortunately, python is just the vehicle that I'm using to feed the regex strings into notepad++. There's a library which takes a regex string and a color and it applies that color to all matches. – Axe319 Feb 06 '20 at 14:13
  • So, what is the input, and expected output? Do you have access to Python code? Can you use PyPi `regex` module rather than `re`? – Wiktor Stribiżew Feb 06 '20 at 14:28
  • I'm not using `re`. I'm sending a raw string and an RGB color to a notepad++ library and having it do the work for me. https://regex101.com/r/6jxkfz/1 Here is the expected output. However that is way more complex than I want it and only supports 2 variables. – Axe319 Feb 06 '20 at 14:38
  • Not sure what is supported, but perhaps try `(?:{{(?=(?:(?!(?:{{|}})).)*}})|\G(?!^))(?:\(?\K[A-Za-z_]\w*(?=\())` https://regex101.com/r/TlWMOs/1 – The fourth bird Feb 06 '20 at 14:42
  • @Thefourthbird That is really close but there are cases where it doesn't work. Example: `'{{ListHeight + 4}}, {{(ListWidth / 2) + 0.5}}, cc 6, '`. I should mention that these are essentially like Python's f-strings and any variable or math operation can be placed inside them. – Axe319 Feb 06 '20 at 14:53
  • Or without the lookahead https://regex101.com/r/B8OtSz/1 – The fourth bird Feb 06 '20 at 14:55
  • @Thefourthbird There are still a few edge cases where it fails. But those are easy enough to stamp out with additional regexes. If you post that as an answer with an explanation I'll accept it. – Axe319 Feb 06 '20 at 15:06
  • Try a workaround, `[A-Za-z_]\w*(?=(?:(?!{{|}}).)*}})` (or ``[A-Za-z_]\w*(?=(?:(?!{{|}})[\s\S])*}})``) – Wiktor Stribiżew Feb 06 '20 at 15:15
  • @WiktorStribiżew I've been using it ever since and it works perfectly! Although there was one caveat I failed to mention. It also needs to not match any strings, however I just applied another string matching regex immediately afterwards and it works perfectly. – Axe319 Feb 17 '20 at 11:23

2 Answers


You may use

\b[A-Za-z_]\w*(?=(?:(?!{{|}}).)*}})

See the regex demo

Details

  • \b - a word boundary
  • [A-Za-z_] - an ASCII letter or _
  • \w* - any 0 or more word chars (letters, digits or _)
  • (?=(?:(?!{{|}}).)*}}) - a positive lookahead that requires
    • (?:(?!{{|}}).)* - any char, 0 or more occurrences, other than line break chars, that does not start a {{ or }} substring (it is a tempered greedy token)
    • }} - a }} substring.
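
As a quick sanity check, the pattern can be run against the two sample strings from the question. This is only a sketch using Python's `re` module, which is an assumption: the asker feeds the pattern to Notepad++ rather than to `re`, and the two engines may differ in places. The braces are escaped here for readability, and the name `NAME_IN_BRACES` is made up for the example.

```python
import re

# Identifier followed (via lookahead) by a closing }} with no {{ or }}
# starting anywhere in between. Braces are escaped to make them
# unambiguous literals.
NAME_IN_BRACES = re.compile(r"\b[A-Za-z_]\w*(?=(?:(?!\{\{|\}\}).)*\}\})")

first = "Test Test2'{{TimeStamps(Descending(1))(7:8)}}/{{TimeStamps(Descending(1))(1:4)}} - '"
second = "'{{ListHeight + 4}}, {{(ListWidth / 2) + 0.5}}, cc 6, '"

print(NAME_IN_BRACES.findall(first))
# ['TimeStamps', 'Descending', 'TimeStamps', 'Descending']
print(NAME_IN_BRACES.findall(second))
# ['ListHeight', 'ListWidth']
```

`Test`, `Test2` and `cc` are skipped because the lookahead either runs into a `{{` or reaches the end of the line before finding a `}}`.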
Wiktor Stribiżew

You could match the variable with your pattern by first matching {{ and asserting that a closing }} follows with no {{ in between. Further variables in the same block are then matched by continuing from the end of the previous match with \G.

(?:{{(?=[^{}]*(?:{(?!{)|}(?!}))*}})|\G(?!^))\(?\K[A-Za-z_]\w*
  • (?: Non capture group
    • {{ Match literally
    • (?= Positive lookahead, assert what is on the right is
      • [^{}]* Match 0+ occurrences of any char except { or }
      • (?:{(?!{)|}(?!}))* If there is { or } assert it is not followed by the same char
      • }} Match literally
    • ) Close lookahead
    • | Or
    • \G(?!^) Assert the position at the end of the previous match, but not at the start of the string
  • ) Close group
  • \(?\K Match an optional ( and forget what is currently matched
  • [A-Za-z_]\w* Pattern to match the variable

Regex demo

The fourth bird
  • I just want to mention that this one `[A-Za-z_]\w*(?=(?:(?!{{|}}).)*}})` works perfectly for my use case. Thanks so much for all your time. – Axe319 Feb 06 '20 at 15:50
  • @Axe319 You are welcome. It is certainly a good pattern, but I think that it will also match without the opening double curly braces, as in `ListHeight + 4}}`. You could ask @WiktorStribiżew to post it as an answer and accept his if the pattern works better for your use case. – The fourth bird Feb 06 '20 at 15:53
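
The caveat raised in the last comment can be reproduced with a short sketch. It assumes Python's `re` module is available, which is not the asker's actual setup, and it contrasts the lookahead-only pattern with the two-step approach suggested in the comments (find each `{{...}}` block first, then look for names inside it). The names `LOOKAHEAD_ONLY`, `BLOCK`, `IDENT` and `names_in_braces` are made up for illustration.

```python
import re

# Lookahead-only pattern: checks for a closing }} but never for an opening {{.
LOOKAHEAD_ONLY = re.compile(r"\b[A-Za-z_]\w*(?=(?:(?!\{\{|\}\}).)*\}\})")

# Two-step alternative: isolate each {{...}} block, then scan inside it.
BLOCK = re.compile(r"\{\{(.*?)\}\}")
IDENT = re.compile(r"\b[A-Za-z_]\w*")


def names_in_braces(text):
    """Return identifiers found inside complete {{...}} blocks only."""
    names = []
    for block in BLOCK.finditer(text):
        names.extend(IDENT.findall(block.group(1)))
    return names


# The first expression is missing its opening {{.
unbalanced = "ListHeight + 4}}, {{(ListWidth / 2) + 0.5}}, cc 6, "

print(LOOKAHEAD_ONLY.findall(unbalanced))  # ['ListHeight', 'ListWidth']
print(names_in_braces(unbalanced))         # ['ListWidth']
```

The two-step version needs code that can iterate over matches, so it only helps if the highlighting script can do more than hand a single regex string to Notepad++.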