How can I take this string:
<div class="content"><div>Content</div></div>
And match this:
<div>Content</div>
I used this regex, but it doesn't work because it matches everything up to the last closing </div>:
/<div\s?(.*)>(.*)<\/div>/
As @Quill said, regex may not be the best option here, but if you do need to pull something out of an HTML string, you can change the regex so that it stops at the first closing </div> by changing the second (.*) to match everything up to the next angle bracket (only that div):
/<div\s?([^>]*)>([^<]*)<\/div>/
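I also changed the group that captures the <div> attributes to [^>]* so that it stops at the end of the opening tag. Note that this pattern only matches a <div> whose content contains no other tags.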
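To see the difference between the two patterns, here is a minimal sketch that runs both against the string from the question. The variable names are only illustrative.

```php
<?php
// Input string taken from the question.
$html = '<div class="content"><div>Content</div></div>';

// Original pattern: the greedy (.*) groups run on to the last </div>.
$greedy = '/<div\s?(.*)>(.*)<\/div>/';

// Revised pattern: [^>]* cannot leave the opening tag and
// [^<]* cannot cross into another tag.
$bounded = '/<div\s?([^>]*)>([^<]*)<\/div>/';

preg_match($greedy, $html, $greedyMatch);
preg_match($bounded, $html, $boundedMatch);

echo $greedyMatch[0], PHP_EOL;   // the whole input string
echo $boundedMatch[0], PHP_EOL;  // <div>Content</div>
echo $boundedMatch[2], PHP_EOL;  // Content
```

The outer div is skipped by the revised pattern because its content starts with another tag, so the first successful match is the inner div.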
This is the simplest approach. To handle more complex cases, you could use a lookahead.
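One possible lookahead variant is sketched below. It is an assumption about what "more advanced" could look like, not part of the original answer: each character of the content is consumed only if it does not start an opening or closing div tag, so the innermost div may still contain other tags such as <b>. The sample input is hypothetical.

```php
<?php
// Hypothetical input: the inner <div> contains another tag.
$html = '<div class="content"><div>Hello <b>world</b></div></div>';

// (?:(?!<\/?div\b).)* consumes one character at a time, but only while
// the text ahead is not "<div" or "</div". The s flag lets . match newlines.
$pattern = '/<div\b[^>]*>((?:(?!<\/?div\b).)*)<\/div>/s';

if (preg_match($pattern, $html, $match)) {
    echo $match[0], PHP_EOL; // <div>Hello <b>world</b></div>
    echo $match[1], PHP_EOL; // Hello <b>world</b>
}
```

This still matches only the innermost div. Matching balanced, arbitrarily nested divs is where a real parser becomes the better tool.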
Using regex to parse HTML is not appropriate in this case because you are dealing with nested (recursive) structures. Use DOMDocument instead:
$html = '<div class="content"><div>Content</div></div>';
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
// Select every <div> whose "class" attribute is exactly "content"
$divs = $xpath->query('//div[@class="content"]');
foreach ($divs as $div) {
    // Print each child node of the matched <div> as HTML
    foreach ($div->childNodes as $childNode) {
        echo $dom->saveHTML($childNode);
    }
}
See IDEONE demo
Result: <div>Content</div>
You will need some tweaks if your input contains invalid HTML, but I assume that is not the case here.
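One such tweak, sketched here as an assumption about what invalid input might call for, is to have libxml collect its parse errors internally so that malformed markup does not raise PHP warnings. The malformed sample string is hypothetical.

```php
<?php
// Hypothetical malformed input: the outer <div> is never closed.
$html = '<div class="content"><div>Content</div>';

// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// Keep whatever the parser reported, then reset the error buffer
// and restore the previous error-handling mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
$divs = $xpath->query('//div[@class="content"]');

foreach ($divs as $div) {
    foreach ($div->childNodes as $childNode) {
        echo $dom->saveHTML($childNode);
    }
}
```

The $errors array can be inspected or logged if you need to know how the parser repaired the markup.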