Replacing Strings in PHP

Question

Consider the following input HTML:

<div class='content'>
    <img style='border-style: solid; border-width: 1px;' src='/media/uploads/defaults/181'/><br/><br/>
    <div class='imgCaption'>
        Reverse Osmosis Caption
    </div>
</div>
<pagebreak/>
<h3>Access </h3>
<h4>Type</h4>
<div class='content'>
    Your plumbing system is accessible with a Main Shut off Valves
</div>
<h4>Location</h4>
<pagebreak/>
<h3>Operation & maintenance #1</h3>
<div class='content'>
    All wastewater treatment systems and their components require regular maintenance.
</div>
<h4>Activity</h4>

I need to find all h4 headers that are not followed by a div of class "content". (In this example, that is <h4>Activity</h4> at the very bottom.)

My regex

/<h4>.*<\/h4>(?!<div class='content'>)/

captures everything after

<h4>Type</h4>

That makes sense, since the header is followed by more than just <div class='content'>.

So my question is: how can I rewrite the regex so that it only picks up the headers that are not followed by a div of class "content"?

Avinash Raj · Accepted Answer · 2014-12-27T10:30:36.570

You need to add .*? at the start of the negative lookahead assertion. Without the .*?, the negative lookahead only checks whether a <div class='content'> tag immediately follows the closing </h4>.

<h4>(?:(?!<\/?h4>).)*?<\/h4>(?!.*?<div class='content'>)

It will match the last h4 tag because it isn't followed by any <div class='content'> tag.
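
As a rough sketch of how this pattern could be applied in PHP, the snippet below runs it through preg_match_all on an abbreviated copy of the question's input. The variable names, the ~ delimiter and the s modifier are assumptions added here, not part of the original answer. The s modifier lets . match newlines, which matters because the tags in the sample sit on separate lines.

```php
<?php
// Abbreviated copy of the input from the question (nowdoc, so nothing is interpolated).
$html = <<<'HTML'
<h3>Access </h3>
<h4>Type</h4>
<div class='content'>
    Your plumbing system is accessible with a Main Shut off Valves
</div>
<h4>Location</h4>
<pagebreak/>
<h3>Operation & maintenance #1</h3>
<div class='content'>
    All wastewater treatment systems and their components require regular maintenance.
</div>
<h4>Activity</h4>
HTML;

// Same pattern as above. Assumptions: "~" is used as the delimiter so "/" needs
// no escaping, and the "s" modifier is added so that "." also matches newlines.
$pattern = "~<h4>(?:(?!</?h4>).)*?</h4>(?!.*?<div class='content'>)~s";

preg_match_all($pattern, $html, $matches);

// $matches[0] holds the full matches; here it contains only <h4>Activity</h4>.
print_r($matches[0]);
```

Note that with this pattern <h4>Location</h4> is not matched either, because a <div class='content'> still appears somewhere later in the input. Without the s modifier the lookahead would stop at the end of each line, and all three h4 headers in this sample would match.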

edited Dec 27 '14 at 10:30

answered Dec 27 '14 at 10:24
