I have a plugin tag [crayon ...] that may or may not be rendered in a <p></p> block like so:
<p>This is a <b>sentence</b> [crayon ...] The Crayon [/crayon] of words. </p>
Since my tag is replaced by a <div> tag, the opening <p> is separated from its closing </p> and the browser closes it for me, leaving a blank paragraph above my plugin. In any case, the markup is invalid and has weird outcomes. My problem is that I need to detect whether [crayon lies inside a <p></p> block. I have found two ways so far:
- Use
<p(?:\s+[^>]*)?>(.*?)</p(?:\s+[^>]*)?>
and search for [crayon in the capture.
- Use
<p[^>]*>(?:[^<]*<(?!/?p(\s+[^>]*)?>)[^>]+(\s+[^>]*)?>)*[^<]*\[crayon
for the case of <p>...[crayon where ... doesn't contain a </p> or <p>, and a similar method for a </p> after the [crayon] tag.
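To make the first method concrete, here is a minimal sketch in Python (used only as a stand-in for testing the pattern; the plugin itself would run this in its own language). The function name crayon_inside_p and the re.DOTALL flag are my assumptions and not part of the original pattern; the flag only lets a paragraph span several lines.

```python
import re

# Method 1 pattern from the question: capture the contents of each <p>...</p> block.
# re.DOTALL is an added assumption so that a paragraph may contain newlines.
P_BLOCK = re.compile(r'<p(?:\s+[^>]*)?>(.*?)</p(?:\s+[^>]*)?>', re.DOTALL)


def crayon_inside_p(html):
    """Return True if any <p>...</p> capture contains an opening [crayon tag."""
    return any('[crayon' in m.group(1) for m in P_BLOCK.finditer(html))


inside = '<p>This is a <b>sentence</b> [crayon ...] The Crayon [/crayon] of words. </p>'
outside = '<p>Intro.</p> [crayon ...] The Crayon [/crayon] <p>Outro.</p>'

print(crayon_inside_p(inside))
print(crayon_inside_p(outside))
```

The second step (the substring search in the capture) is the "further processing" that this method needs and the second method avoids.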
The second method is harder to read, but it will fail to match if a </p> occurs before my tag, and it doesn't require any further processing to find my tag within the <p></p>, as the first does. However, the first regex is much simpler and will execute quicker. Which should I use, and is there a better way?
EDIT:
For method 2, this beast works:
<p[^<]*>(?:[^<]*<(?!/?p(\s+[^>]*)?>)[^>]+(\s+[^>]*)?>)*[^<]*((?:\[crayon[^\]]*\].*?\[/crayon\])|(?:\[crayon[^\]]*/\]))(?:[^<]*<(?!/?p(\s+[^>]*)?>)[^>]+(\s+[^>]*)?>)*[^<]*</p[^<]*>
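A quick way to sanity-check this pattern is to run it against a few sample strings. The sketch below does that in Python, again only as a test harness; the names NON_P_TAGS, CRAYON, WRAPPED and wrapped_crayon are hypothetical, and the pattern is copied verbatim, split into pieces only for readability. Because of the two groups inside the first repeated part, the crayon tag ends up in capture group 3.

```python
import re

# Pieces of the regex from the EDIT, concatenated back into the same pattern.
# NON_P_TAGS: any run of text and tags that are not <p ...> or </p ...>.
NON_P_TAGS = r'(?:[^<]*<(?!/?p(\s+[^>]*)?>)[^>]+(\s+[^>]*)?>)*[^<]*'
# CRAYON: either [crayon ...]...[/crayon] or the self-closing [crayon ... /].
CRAYON = r'((?:\[crayon[^\]]*\].*?\[/crayon\])|(?:\[crayon[^\]]*/\]))'
WRAPPED = re.compile(r'<p[^<]*>' + NON_P_TAGS + CRAYON + NON_P_TAGS + r'</p[^<]*>')


def wrapped_crayon(html):
    """Return the crayon tag enclosed by a <p>...</p> pair, or None if there is none."""
    m = WRAPPED.search(html)
    return m.group(3) if m else None


inside = '<p>This is a <b>sentence</b> [crayon ...] The Crayon [/crayon] of words. </p>'
outside = '<p>Intro.</p> [crayon ...] The Crayon [/crayon] <p>Outro.</p>'
self_closing = '<p>Code: [crayon lang="php" /] done</p>'

print(wrapped_crayon(inside))
print(wrapped_crayon(outside))
print(wrapped_crayon(self_closing))
```

One thing to watch: the opening <p[^<]*> also accepts other tags whose names start with p, such as <pre>, so the stricter <p(?:\s+[^>]*)?> from the first method may be a safer opener.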
, why are you using a
, you'll need a proper HTML parser.
with the wpautop() function.
– Aram Kocharyan Jan 21 '12 at 04:59
or ` rather than go the regex route.