Adding a Regex101 URL: https://regex101.com/r/7Pd1ri/1/

I've been working on this problem for about three and a half hours.

I can't for the life of me figure out what I'm doing wrong. I'm so close to making it work perfectly, but I don't know how to make the match stop at the first occurrence of {/IF}.

My REGEX:

{IF\s*([^}]+)}([\s\S]+)(?:{\/ELSE})([\s\S]+){\/IF}

The file being parsed:

{IF !isLoggedIn}
    <a href="/sign-in/">Sign In</a>
{ELSE}
    Welcome back, {>username}!<br />
    {# students}
    <li>{#>last_name}, {#>first_name}</li>
    {/#}
{/IF}

{IF isLoggedIn}
    Beta Testing...
{/IF}

For some reason, the regex captures everything up to the second {/IF} as the body of the ELSE.

Matthew Auld
  • What language is this? – Chava Geldzahler Aug 04 '17 at 00:52
  • It's custom. I'm trying to get a regex to parse the IF/ELSE. – Matthew Auld Aug 04 '17 at 00:59
  • The question about language is relevant. There are various regex engines, which all differ in syntax and functionality. The flavor of regex you're using is entirely relevant, and the ability to answer requires that detail. If you've written your own regex engine (*its custom*), then you'll need to either make it available to everyone here or solve the problem yourself. – Ken White Aug 04 '17 at 01:04
  • (1) You need to understand the difference between greedy and non-greedy matching. (2) For what purpose are you writing this template engine? If it's for hobby or academic purposes, then fine. But otherwise I'd highly recommend you adopt a ready-made solution such as handlebars or nunjucks. – David L. Walsh Aug 04 '17 at 01:07
  • _"For some reason, the regex captures all the way to the second `{/IF}`"_ See `{IF\s*` at `RegExp`. What is expected result? – guest271314 Aug 04 '17 at 01:09
  • @DavidL.Walsh This is purely educational on my end. Just building things from the ground up so I understand how it works. – Matthew Auld Aug 04 '17 at 01:39

1 Answer

Your problem is the greedy behavior of the capture group for the else block:

([\s\S]+)

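A greedy quantifier takes the longest possible match that still lets the rest of the regex succeed, so this group runs all the way to the last {/IF} in the file.

To change that, add ? after the + to make the quantifier lazy. It then takes the shortest possible match, so the group stops at the first {/IF}.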

Your regex will now be:

{IF\s*([^}]+)}([\s\S]+)(?:{\/ELSE})([\s\S]+?){\/IF}

https://regex101.com/r/7Pd1ri/2
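To see the difference outside Regex101, here is a minimal sketch using Python's `re` module. The question never says which regex engine is in use, so Python is an assumption, as are the names `TEMPLATE`, `GREEDY`, and `LAZY`. The braces are escaped for safety, and the separator is written `{ELSE}` to match the sample file rather than the `{\/ELSE}` in the posted pattern.

```python
import re

# Sample template copied from the question.
TEMPLATE = """{IF !isLoggedIn}
    <a href="/sign-in/">Sign In</a>
{ELSE}
    Welcome back, {>username}!<br />
    {# students}
    <li>{#>last_name}, {#>first_name}</li>
    {/#}
{/IF}

{IF isLoggedIn}
    Beta Testing...
{/IF}
"""

# Assumption: the separator is {ELSE}, as in the sample file.
# GREEDY is the original pattern; LAZY only adds ? to the else-block group.
GREEDY = re.compile(r"\{IF\s*([^}]+)\}([\s\S]+)\{ELSE\}([\s\S]+)\{/IF\}")
LAZY = re.compile(r"\{IF\s*([^}]+)\}([\s\S]+)\{ELSE\}([\s\S]+?)\{/IF\}")

greedy_else = GREEDY.search(TEMPLATE).group(3)
lazy_else = LAZY.search(TEMPLATE).group(3)

# The greedy else body runs on into the second IF block; the lazy one does not.
print("Beta Testing" in greedy_else)  # True
print("Beta Testing" in lazy_else)    # False
```

With the lazy quantifier, group 3 ends just before the first {/IF}, which is the behavior the question asks for.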

More details on this behavior are in "What do lazy and greedy mean in the context of regular expressions?"

By the way, you'll need to do the same in your if block too; otherwise an {ELSE} in your second {IF} will cause the same problem.
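The if-block issue can be reproduced with a small made-up template in which both blocks contain an {ELSE}. This is a sketch in Python's `re` module (an assumed engine); the template string and the names `HALF_LAZY` and `BOTH_LAZY` are hypothetical, and the separator is again assumed to be `{ELSE}`.

```python
import re

# Hypothetical one-line template: two IF blocks, each with an ELSE.
template = "{IF a}A{ELSE}B{/IF}{IF c}C{ELSE}D{/IF}"

# Only the else-block group is lazy; the if-block group is still greedy.
HALF_LAZY = re.compile(r"\{IF\s*([^}]+)\}([\s\S]+)\{ELSE\}([\s\S]+?)\{/IF\}")
# Both body groups are lazy.
BOTH_LAZY = re.compile(r"\{IF\s*([^}]+)\}([\s\S]+?)\{ELSE\}([\s\S]+?)\{/IF\}")

half = HALF_LAZY.findall(template)
both = BOTH_LAZY.findall(template)

# The greedy if body stretches to the LAST {ELSE}, swallowing the first block.
print(half)  # [('a', 'A{ELSE}B{/IF}{IF c}C', 'D')]
# With both groups lazy, each block is matched on its own.
print(both)  # [('a', 'A', 'B'), ('c', 'C', 'D')]
```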

Luizgrs