How to capture within html attributes

Question

<p <%=foo1%> <%=foo2%> >

   <h3><%=bar1%></h3>

   <h4><%=bar2%></h4>

</p>

I am looking for a regular expression that captures foo1 and foo2, because those are the values declared as attributes. bar1 and bar2 should not be captured because they are not declared as attributes.

I am using Ruby 1.8.7.

Please see the first answer here: [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) — Svante, Apr 19 '11 at 19:04

score 0 · Answer 1 · answered Apr 19 '11 at 17:08

Maaaybe something like

<(?:[^>]*<%=(.*)%>)+[^>]*>
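A quick check of the pattern above in Ruby, as a sketch only (the helper name `first_capture` and the single-line sample tag are assumptions for illustration). On the question's opening tag it seems to return one over-long capture rather than foo1 and foo2, because the greedy `.*` runs on to the last `%>` on the line.

```ruby
# Hypothetical helper: returns capture group 1 of the pattern from this answer.
def first_capture(tag)
  tag[/<(?:[^>]*<%=(.*)%>)+[^>]*>/, 1]
end

p first_capture('<p <%=foo1%> <%=foo2%> >')
# => "foo1%> <%=foo2"
```

Switching to a lazy `.*?` would not be enough on its own, since a repeated group only keeps its last iteration.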

Brad

Mike Ryan · Accepted Answer · 2011-04-19T18:39:40.143

This is a case where I think you're better off doing two passes. First, extract all the <%= %> values that appear as attributes inside tags. Then go through that output and strip off the <%= and %> delimiters.

For example:

 <[^>]*?((?:<%=[^%]*%>\s*)+)[^<]*>

Gives you:

   <%=foo1%> <%=foo2%>

Then, a simple

<%=(.*?)%>

on the output from the first regex gives you foo1, foo2, etc. I've been trying to construct a single combined regex, but the only way I can see to do that is with a look-behind. I don't think that's supported in Ruby 1.8.7, and regardless, since the look-behind would have to match at the same point multiple times, I believe most engines would reject it.
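The two passes described above could be sketched in Ruby roughly as follows. The helper name `extract_erb_attributes` and the sample `template` string are assumptions for illustration; the two regexes are the ones from this answer, and `String#scan` is used for both passes.

```ruby
# Pass 1: find tags whose attribute area contains <%= ... %> and capture that run.
# Pass 2: pull the name out of each <%= ... %> in the captured run.
# extract_erb_attributes is a hypothetical helper name, not from the answer.
def extract_erb_attributes(source)
  runs = source.scan(/<[^>]*?((?:<%=[^%]*%>\s*)+)[^<]*>/).flatten
  runs.map { |run| run.scan(/<%=(.*?)%>/) }.flatten.map { |name| name.strip }
end

template = "<p <%=foo1%> <%=foo2%> >\n  <h3><%=bar1%></h3>\n  <h4><%=bar2%></h4>\n</p>"
p extract_erb_attributes(template)
# => ["foo1", "foo2"]
```

`map` plus `flatten` is used instead of `flat_map` so the sketch should also suit older Rubies such as the 1.8.7 mentioned in the question.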

score 0 · Answer 3 · answered Apr 19 '11 at 22:39

Will this work?

/(?:<|\G)[^<>]*?<%=([^<>]*?)%>/
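One way to try this pattern is with `String#scan`, where `\G` anchors each attempt at the point the previous match ended. This is a sketch only; the helper name `scan_erb_attributes` and the sample `template` string are assumptions for illustration.

```ruby
# Hypothetical helper wrapping the \G pattern from this answer.
# Each match either starts at a "<" or continues from where the last one ended.
def scan_erb_attributes(source)
  source.scan(/(?:<|\G)[^<>]*?<%=([^<>]*?)%>/).flatten
end

template = "<p <%=foo1%> <%=foo2%> >\n  <h3><%=bar1%></h3>\n  <h4><%=bar2%></h4>\n</p>"
p scan_erb_attributes(template)
# => ["foo1", "foo2"]
```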

sawa

I wonder whether whoever downvoted even knows what `\G` is. It seems that everyone's answer was downvoted, even the accepted one. – sawa Apr 19 '11 at 22:58
Hmm, I don't understand why everyone got downvoted either :/ Have an upvote for what is a perfectly valid suggestion! – David Apr 20 '11 at 08:37

score -1 · Answer 4 · answered Apr 19 '11 at 17:06

How about something like this:

\<\w+\s((.*)\s?)\>

This assumes you will be running the regex on the output.

David
