
I want to capture every piece of text, including multi-line blocks, between <% and %>.

For example:

<html>
<head>
<title>Title Here</title>
</head>
<body>
<% include("/path/to/include") %>
<h1>Test Template</h1>
<p>Variable: <% print(second_var) %></p>
<%

variable = value;

foreach(params here)
{
    code here
}

%>
<p><a href="/" title="Home">Home</a></p>
</body>
</html>

I have tried \<\%(.*)\%\>, but it matches from the first <% to the last %>, so it captures everything in between, including the <h1>Test Template</h1> line.

General Grievance
Lark

3 Answers


Which regex engine are you using?

<%(.*?)%>

should work with the "dot matches newline" option enabled. If you don't know how to set that, try

<%([\s\S]*?)%>

or

(?s)<%(.*?)%>

No need to escape <, %, or > by the way.
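
As a rough illustration, here is how the three patterns above behave in one particular engine. Python's re module is an assumption on my part, since the question never names an engine. The template is an abbreviated copy of the one in the question, and the extract_blocks helper is a hypothetical name used only for this sketch.

```python
import re

# Abbreviated copy of the template from the question.
TEMPLATE = """<body>
<% include("/path/to/include") %>
<h1>Test Template</h1>
<p>Variable: <% print(second_var) %></p>
<%

variable = value;

%>
<p><a href="/" title="Home">Home</a></p>
</body>"""

# Lazy quantifier plus "dot matches newline" (re.DOTALL in Python).
BLOCK_RE = re.compile(r"<%(.*?)%>", re.DOTALL)


def extract_blocks(text):
    """Return the whitespace-stripped contents of every <% ... %> block."""
    return [body.strip() for body in BLOCK_RE.findall(text)]


blocks = extract_blocks(TEMPLATE)
for block in blocks:
    print(repr(block))

# The two alternative spellings, which need no separate flag argument.
char_class_blocks = re.findall(r"<%([\s\S]*?)%>", TEMPLATE)
inline_flag_blocks = re.findall(r"(?s)<%(.*?)%>", TEMPLATE)
```

In this sketch all three spellings find the same three blocks, and the <h1> line is never captured. Other engines may spell the flag differently.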

Tim Pietzcker

Use \<\%(.*?)\%\>. The lazy quantifier .*? gives non-greedy matching, so each match stops at the first %> it reaches.

EDIT: To solve the multi-line problem, you can't rely on the plain . wildcard, because by default it matches every character except a newline. The option that changes this differs between regular expression engines, so tell me which engine you are using and I can say what to do.
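
A small sketch of both points, using Python's re module purely as an example engine (the question does not say which engine is in use). The sample strings are made up for illustration.

```python
import re

# Two blocks on one line: greedy vs. lazy matching.
line = "<p><% print(a) %> and <% print(b) %></p>"

greedy = re.findall(r"<%(.*)%>", line)   # runs on to the LAST %>
lazy = re.findall(r"<%(.*?)%>", line)    # stops at the FIRST %>

# A block that spans several lines: by default "." does not match "\n".
multi_line = "<%\nx = 1\n%>"

without_dotall = re.findall(r"<%(.*?)%>", multi_line)
with_dotall = re.findall(r"<%(.*?)%>", multi_line, re.DOTALL)

print(greedy)
print(lazy)
print(without_dotall)
print(with_dotall)
```

Here the greedy pattern returns one oversized match, the lazy one returns both blocks, and the multi-line block is only found once the dot is allowed to match newlines.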

Rafe Kettler

I've been using Microsoft's regex engine (provided by JScript in IE). It has a 'multi-line' switch that affects the behaviour of ., but I still had problems, which I resolved using [\u0000-\uFFFF]; that matches everything, including line endings and control characters.

So have a go with <%([\u0000-\uFFFF]*?)%>

Stijn Sanders
  • The multiline (`m`) modifier does not affect the behavior of `.`. It's the single-line (DOTALL, `s`) modifier that does that, but JavaScript doesn't support it. The most common idiom for matching anything-including-newlines in JavaScript is `[\s\S]`, as @Tim demonstrated in his answer. – Alan Moore Oct 23 '10 at 01:12
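
The distinction drawn in the comment above can be checked in an engine that has both modifiers. The sketch below uses Python's re module as a stand-in (an assumption; the thread itself concerns JScript), where re.MULTILINE corresponds to `m` and re.DOTALL to `s`.

```python
import re

text = "<%\ncode here\n%>"

# MULTILINE changes what ^ and $ match, not what "." matches.
anchored_default = re.findall(r"^code", text)
anchored_multiline = re.findall(r"^code", text, re.MULTILINE)

# "." still refuses to cross a newline under MULTILINE ...
dot_multiline = re.findall(r"<%(.*?)%>", text, re.MULTILINE)
# ... but crosses it under DOTALL.
dot_dotall = re.findall(r"<%(.*?)%>", text, re.DOTALL)

# Flag-free idioms from the answers: any whitespace or non-whitespace,
# and the explicit \u0000-\uFFFF range.
char_class = re.findall(r"<%([\s\S]*?)%>", text)
bmp_range = re.findall(r"<%([\u0000-\uFFFF]*?)%>", text)

print(anchored_default, anchored_multiline)
print(dot_multiline, dot_dotall)
print(char_class, bmp_range)
```

Note that the [\u0000-\uFFFF] range only covers characters up to U+FFFF, so in Python a block containing a character beyond that range would not match, whereas [\s\S] has no such limit.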