Lookahead regex pattern in javascript

Question

I'm trying to match from the following string:

[[NOTE]]This is my note.[[NOTE]]

the following pattern:

This is my note.

As I can't use lookaheads, this is my current attempt:

[^\[\[NOTE\]\]](.*)[^\[\[NOTE\]\]]

But it still doesn't work.

Amal Murali · Answer 1 · 2014-09-04T12:23:13.573

Your current regex wouldn't work. Some points:

You're putting everything inside a character class: [^...]. A negated character class matches a single character that is not in the list, which is not what you want in this case. Remove the outer brackets.
If I understand your question correctly, you don't actually need lookaround expressions for what you're trying to do. And JavaScript does support lookaheads; lookbehinds were not supported when this was written (they were added later, in ES2018).
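To see what the character classes in the question actually do, here is a small sketch. The variable names (attempt, sameThing, input) are made up for illustration; the first pattern is the one from the question, the second is the same class with the duplicate characters removed.

```javascript
// The question's pattern: each [^...] is ONE negated character class,
// not "anything except the text [[NOTE]]".
const attempt = /[^\[\[NOTE\]\]](.*)[^\[\[NOTE\]\]]/;

// Same class with duplicates removed: "one character that is not [ ] N O T E".
const sameThing = /[^\[\]NOTE](.*)[^\[\]NOTE]/;

const input = '[[NOTE]]This is my note.[[NOTE]]';

const a = attempt.exec(input);
const b = sameThing.exec(input);

// The capital "T" of "This" is in the excluded set, so the match starts at "h".
console.log(a[0]); // his is my note.
console.log(a[1]); // is is my note
console.log(b[1] === a[1]); // true
```

The first class eats the "h" and the last class eats the ".", so the capture loses characters at both ends.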

Assuming you're trying to match the text between the opening and closing [[NOTE]] tags, why not just use:

\[\[NOTE\]\]([^\[\]]+)\[\[NOTE\]\]

Explanation:

\[\[NOTE\]\] matches [[NOTE]]
[^\[\]]+ is a negated character class that matches one or more characters that are not [ or ].
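A quick check of that pattern in JavaScript, as a sketch (noteRe and found are illustrative names, not part of the answer):

```javascript
// Opening tag, one or more non-bracket characters (captured), closing tag.
const noteRe = /\[\[NOTE\]\]([^\[\]]+)\[\[NOTE\]\]/;

const found = noteRe.exec('[[NOTE]]This is my note.[[NOTE]]');
console.log(found[1]); // This is my note.

// Because of the + quantifier, an empty note does not match at all.
const empty = noteRe.exec('[[NOTE]][[NOTE]]');
console.log(empty); // null
```

If empty notes should match too, the + would have to become *.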

score 1 · Answer 2 · answered Sep 04 '14 at 12:22

Use what you captured in the first capturing group:

var re = /\[\[NOTE]](.*?)\[\[NOTE]]/; 
var str = '[[NOTE]]This is my note.[[NOTE]]';
var m = re.exec(str);

console.log(m[1])
// This is my note.

Making your * quantifier lazy by adding a ? avoids matching This a note[[NOTE]][[NOTE]]Also note in

[[NOTE]]This a note[[NOTE]][[NOTE]]Also note[[NOTE]]
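The difference is easy to see side by side. This is only a sketch; greedy, lazy and all are names chosen for the example, and String.prototype.matchAll assumes a reasonably modern engine (ES2020).

```javascript
const text = '[[NOTE]]This a note[[NOTE]][[NOTE]]Also note[[NOTE]]';

const greedy = /\[\[NOTE]](.*)\[\[NOTE]]/;  // .* grabs as much as it can
const lazy = /\[\[NOTE]](.*?)\[\[NOTE]]/;   // .*? stops at the first closing tag

console.log(greedy.exec(text)[1]); // This a note[[NOTE]][[NOTE]]Also note
console.log(lazy.exec(text)[1]);   // This a note

// With the g flag, the lazy version picks up every note in turn.
const all = Array.from(text.matchAll(/\[\[NOTE]](.*?)\[\[NOTE]]/g), (m) => m[1]);
console.log(all); // [ 'This a note', 'Also note' ]
```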

Avinash Raj · Accepted Answer · 2014-09-04T12:26:11.770


You need to escape the opening [ bracket in your regex and also you have to remove the ^ symbol from the character class.

\[\[NOTE]]((?:(?!\[\[NOTE]]).)*)\[\[NOTE]]

DEMO

> var re = /\[\[NOTE]]((?:(?!\[\[NOTE]]).)*)\[\[NOTE]]/g;
undefined
> var str = '[[NOTE]]This is my note.[[NOTE]]';
undefined
> var m;
undefined
> while ((m = re.exec(str)) != null) {
... console.log(m[1]);
... }
This is my note.

edited Sep 04 '14 at 12:26

answered Sep 04 '14 at 12:17



Curious as to why you're using `((?:(?!\[\[NOTE]]).)*)` to capture the text in between. That is certainly slower than the other alternatives mainly because it needs to check each position and see if it is not followed by a `[[NOTE]]`. Why not use a negated character class instead? – Amal Murali Sep 04 '14 at 12:32
Why are you using `\[\[NOTE\]\]([^\[\]]+)\[\[NOTE\]\]`? Just `.*?` in between would do the job. – Avinash Raj Sep 04 '14 at 12:34

Because it is more specific than `.*?`. If you take a look at the regex debugger on the demo link you've posted, you can see that your regex finds the match only after **95 steps**, whereas mine would only take 21 steps. – Amal Murali Sep 04 '14 at 12:38

@AmalMurali why did you downvote my answer? Your answer won't work if the content contains a `[` in the middle: `[[NOTE]]This is my [note.[[NOTE]]`. OP wants the text between two `[[NOTE]]`'s – Avinash Raj Sep 04 '14 at 12:41
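For anyone following the comment thread, the trade-off can be checked directly. A sketch only: the three patterns are the ones posted in the answers, while the variable names are invented here. It shows matching behaviour, not the step counts mentioned in the comments.

```javascript
// Input from the last comment: a single [ inside the note text.
const tricky = '[[NOTE]]This is my [note.[[NOTE]]';

const negatedClass = /\[\[NOTE\]\]([^\[\]]+)\[\[NOTE\]\]/;          // no brackets allowed inside
const lazyDot = /\[\[NOTE]](.*?)\[\[NOTE]]/;                         // lazy dot
const tempered = /\[\[NOTE]]((?:(?!\[\[NOTE]]).)*)\[\[NOTE]]/;       // lookahead before each char

console.log(negatedClass.exec(tricky)); // null
console.log(lazyDot.exec(tricky)[1]);   // This is my [note.
console.log(tempered.exec(tricky)[1]);  // This is my [note.
```

So the negated class is the stricter pattern and rejects notes containing brackets, while the other two accept them.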

score 0 · Answer 4 · edited May 23 '17 at 11:49

First of all, lookahead is fully supported in JavaScript (lookbehind was not when this was written).

In your regex [^\[\[NOTE\]\]](.*)[^\[\[NOTE\]\]]

the [] can only test single characters. For example, [^cat] matches any one character that is not c, a, or t.
In the middle you need a lazy quantifier, *?, to match only up to the first closing token if the template is long.

Without these mistakes it works perfectly: \[\[NOTE\]\](.*?)\[\[NOTE\]\]

It matches [[NOTE]]This is my note.[[NOTE]] as a whole, and the first capturing group holds This is my note.
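Both points can be tried out quickly. A sketch, with 'tact' and 'cart' as made-up test words and fixed / m as illustrative names:

```javascript
// [^cat] is one character that is not c, a, or t; it knows nothing about the word "cat".
console.log(/[^cat]/.test('tact')); // false: every character is c, a, or t
console.log(/[^cat]/.test('cart')); // true: "r" is outside the set

// The corrected pattern: escaped literal tags with a lazy capture in between.
const fixed = /\[\[NOTE\]\](.*?)\[\[NOTE\]\]/;
const m = fixed.exec('[[NOTE]]This is my note.[[NOTE]]');

console.log(m[0]); // [[NOTE]]This is my note.[[NOTE]]
console.log(m[1]); // This is my note.
```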

Another option is to capture the opening token and reuse it as the closing token through a backreference:

(\[\[NOTE\]\])(.*?)\1
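A sketch of the backreference version. Note that the text moves to the second group, because the first group now holds the tag. The anyTag pattern with [A-Z]+ is my own generalization for illustration, not something from the question.

```javascript
const backref = /(\[\[NOTE\]\])(.*?)\1/;
const m = backref.exec('[[NOTE]]This is my note.[[NOTE]]');

console.log(m[1]); // [[NOTE]]
console.log(m[2]); // This is my note.

// Hypothetical generalization: any upper-case tag name, closed by the same tag.
const anyTag = /(\[\[[A-Z]+\]\])(.*?)\1/;
const t = anyTag.exec('[[TIP]]a[[NOTE]]b[[TIP]]');

console.log(t[1]); // [[TIP]]
console.log(t[2]); // a[[NOTE]]b
```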

Or you can use a positive lookahead:

(\[{2,2}NOTE\]{2,2})(.*?)(?=\1)
so the full match is only [[NOTE]]This is my note. (the closing tag is checked but not consumed).
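A sketch of what the lookahead changes in practice. The second half, with a made-up input containing two notes, shows a side effect worth knowing about when the g flag is used: since the closing tag is not consumed, the next match starts on it and treats it as an opening tag.

```javascript
const ahead = /(\[{2,2}NOTE\]{2,2})(.*?)(?=\1)/;
const m = ahead.exec('[[NOTE]]This is my note.[[NOTE]]');

console.log(m[0]); // [[NOTE]]This is my note.
console.log(m[2]); // This is my note.

// Two notes separated by " b " (illustrative input).
const two = '[[NOTE]]a[[NOTE]] b [[NOTE]]c[[NOTE]]';

const viaLookahead = Array.from(
  two.matchAll(/(\[{2,2}NOTE\]{2,2})(.*?)(?=\1)/g), (x) => x[2]);
const viaConsuming = Array.from(
  two.matchAll(/(\[\[NOTE\]\])(.*?)\1/g), (x) => x[2]);

console.log(viaLookahead); // [ 'a', ' b ', 'c' ]
console.log(viaConsuming); // [ 'a', 'c' ]
```

For a single note the two forms give the same captured text; for repeated matching the consuming form is usually what you want.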

If you want to use a different closing tag, for example [[/NOTE]], then you can use something like these:

\[{2,2}(NOTE)\]{2,2}(.*?)(?=\[{2,2}\/\1\]{2,2})
\[{2,2}(NOTE)\]{2,2}(.*?)\[{2,2}\/\1\]{2,2}
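Both variants tried on a [[NOTE]]...[[/NOTE]] input, as a sketch (variable names are illustrative). Here group 1 is the tag name and group 2 is the text.

```javascript
const input = '[[NOTE]]This is my note.[[/NOTE]]';

const withLookahead = /\[{2,2}(NOTE)\]{2,2}(.*?)(?=\[{2,2}\/\1\]{2,2})/;
const consuming = /\[{2,2}(NOTE)\]{2,2}(.*?)\[{2,2}\/\1\]{2,2}/;

const a = withLookahead.exec(input);
console.log(a[0]); // [[NOTE]]This is my note.
console.log(a[2]); // This is my note.

const b = consuming.exec(input);
console.log(b[0]); // [[NOTE]]This is my note.[[/NOTE]]
console.log(b[1]); // NOTE
console.log(b[2]); // This is my note.
```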

If you want to support nested open-close statements, then the only option is to parse the template token by token. Perl-compatible regex engines offer recursive patterns, which make it much easier to parse nested open-close templates, but that feature is not available in JavaScript... If you don't need to support nested structures, then the regex above will be enough. Use the first capturing group instead of the full match, that's all. Lookaround is not necessary...
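For the nested case, a token-by-token parse could look roughly like this. It is a minimal sketch, not a finished template parser: parseNotes is a hypothetical helper, it assumes distinct [[NOTE]] / [[/NOTE]] tags as in the example above, and the { children: [...] } node shape is invented for the example.

```javascript
// Hypothetical helper: turn nested [[NOTE]] ... [[/NOTE]] blocks into a tree.
function parseNotes(source) {
  const tokenRe = /\[\[(\/?)NOTE\]\]/g; // one opening or closing tag per match
  const root = { children: [] };
  const stack = [root];                 // currently open notes, innermost last
  let last = 0;                         // end of the previous tag
  let match;

  while ((match = tokenRe.exec(source)) !== null) {
    // Plain text between the previous tag and this one goes to the open note.
    const text = source.slice(last, match.index);
    if (text) stack[stack.length - 1].children.push(text);
    last = tokenRe.lastIndex;

    if (match[1] === '') {
      // Opening tag: create a node and descend into it.
      const node = { children: [] };
      stack[stack.length - 1].children.push(node);
      stack.push(node);
    } else {
      // Closing tag: it must close a note that is currently open.
      if (stack.length === 1) {
        throw new SyntaxError('Unexpected [[/NOTE]] at index ' + match.index);
      }
      stack.pop();
    }
  }

  if (stack.length > 1) throw new SyntaxError('Unclosed [[NOTE]]');
  const tail = source.slice(last);
  if (tail) root.children.push(tail);
  return root.children;
}

const tree = parseNotes('a[[NOTE]]outer [[NOTE]]inner[[/NOTE]] end[[/NOTE]]b');
console.log(JSON.stringify(tree));
// ["a",{"children":["outer ",{"children":["inner"]}," end"]},"b"]
```

The regex here only finds the tags; the nesting itself is tracked by the stack, which is the part a single JavaScript regex cannot do.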

By the way, I strongly recommend using an existing template system; we don't need another one... See: What Javascript Template Engines you recommend?
