I have this block of HTML:
<div>
<p>First, nested paragraph</p>
</div>
<p>First, non-nested paragraph.</p>
<p>Second paragraph.</p>
<p>Last paragraph.</p>
I'm trying to select the first non-nested paragraph in that block. I'm using PHP's (Perl-style) preg_match to find it, but I can't figure out how to ignore the p tag contained within the div.
This is what I have so far, but it selects the contents of the first paragraph in the block, which is the one nested inside the div.
/<p>(.+?)<\/p>/is
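One possible way to make the pattern above step over the div is PCRE's backtracking control verbs, which preg_match supports. The following is only a sketch, assuming div blocks are never nested inside other divs and the markup is as regular as the sample; the function name `firstTopLevelParagraph` is made up for illustration.

```php
<?php
// Hypothetical helper: returns the text of the first <p> that is not inside a <div>.
// Assumes <div> blocks are never nested inside each other.
function firstTopLevelParagraph(string $html): ?string
{
    // First alternative: consume a whole <div>...</div>, then force a failure and
    // tell the engine not to retry inside the consumed text.
    // Second alternative: an ordinary <p>...</p>, captured in group 1.
    $pattern = '~<div\b[^>]*>.*?</div>(*SKIP)(*FAIL)|<p>(.+?)</p>~is';

    if (preg_match($pattern, $html, $m) === 1) {
        return $m[1];
    }
    return null; // no top-level paragraph found
}

$html = <<<'HTML'
<div>
<p>First, nested paragraph</p>
</div>
<p>First, non-nested paragraph.</p>
<p>Second paragraph.</p>
<p>Last paragraph.</p>
HTML;

echo firstTopLevelParagraph($html), "\n"; // prints: First, non-nested paragraph.
```

The idea is that anything the first alternative swallows can never be matched by the second one, so only paragraphs outside a div remain candidates.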
Thanks!
EDIT
Unfortunately, I don't have the luxury of a DOM Parser.
I completely appreciate the suggestions not to use regex to parse HTML, but that doesn't really help with my particular use case. I have a very controlled case where an internal application generates structured text, and I'm trying to replace some text if it matches a certain pattern. The question above is a simplified version in which I'm trying to ignore text nested within other text; HTML was the simplest example I could think of to explain it. My actual case looks a little more like this (but with a lot more data, and minified):
#[BILLINGCODE|12345|11|15|2001|15|26|50]#
[ITEM1|{{Escaped Description}}|1|1|4031|NONE|15]
#[{{Additional Details }}]#
[ITEM2|{{Escaped Description}}|3|1|7331|NONE|15]
[ITEM3|{{Escaped Description}}|1|1|9431|NONE|15]
[ITEM4|{{Escaped Description}}|1|1|5131|NONE|15]
I have to reformat a certain column of certain rows, across a ton of rows similar to these. An answer to my first question would help with the actual project.
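For the structured text, the same skip-and-fail idea might look like the sketch below. It rests on several assumptions that are not stated in the sample: `#[...]#` blocks are the parts to ignore, descriptions inside `{{...}}` never contain `}}` or `]#`, and the remaining fields never contain `]`. The function name `reformatItemColumn`, the choice of column, and the zero-padding are placeholders for whatever reformatting is actually needed.

```php
<?php
// Hypothetical helper: rewrites one column of every top-level [ITEMn|...] row and
// leaves #[...]# blocks untouched. $column is a zero-based index into the fields
// that follow the {{description}}.
function reformatItemColumn(string $text, int $column, callable $reformat): string
{
    $pattern = '~#\[.*?\]#(*SKIP)(*FAIL)'                     // consume and discard #[...]# blocks
             . '|\[(ITEM\d+)\|(\{\{.*?\}\})\|([^\]]*)\]~s';   // [ITEMn|{{description}}|fields...]

    $result = preg_replace_callback($pattern, function (array $m) use ($column, $reformat) {
        $fields = explode('|', $m[3]);
        if (isset($fields[$column])) {
            $fields[$column] = $reformat($fields[$column]);
        }
        return '[' . $m[1] . '|' . $m[2] . '|' . implode('|', $fields) . ']';
    }, $text);

    return $result ?? $text; // preg_replace_callback returns null on a PCRE error
}

$text = <<<'TXT'
#[BILLINGCODE|12345|11|15|2001|15|26|50]#
[ITEM1|{{Escaped Description}}|1|1|4031|NONE|15]
#[{{Additional Details }}]#
[ITEM2|{{Escaped Description}}|3|1|7331|NONE|15]
TXT;

// Example reformat: zero-pad the third field after the description to six digits.
echo reformatItemColumn($text, 2, function (string $v): string {
    return str_pad($v, 6, '0', STR_PAD_LEFT);
}), "\n";
```

With the sample rows, `4031` and `7331` become `004031` and `007331`, while the two `#[...]#` lines pass through unchanged. Because the pattern does not depend on line breaks, it should behave the same on minified input, provided the assumptions above hold.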
.+/m` If that is not sufficient, please **detail your requirements fully**.
– salathe Dec 13 '11 at 23:01
(.+?)<\/p>_is'` might work if the HTML block structure is always similar to the example you showed. The result will be in `[2]`, and some prefix will remain because you cannot use an assertion for that. Otherwise you will need a recursive `(?R)` regex... (Add a bounty if you need that.) -- Using [QueryPath](http://stackoverflow.com/questions/tagged/QueryPath) would be so much simpler: `htmlqp($html)->find("p")->not("div p");`, or use [SimpleHtmlDom](http://stackoverflow.com/questions/tagged/SimpleHtmlDom) for older PHP servers without DOM support.
– mario Dec 13 '11 at 23:48