Match first Character of the same

Question

How can I use this:

<div class="content"><div>Content</div></div>

And match this:

<div>Content</div>

I tried this regex, but it doesn't work because it matches through to the last div:

/<div\s?(.*)>(.*)<\/div>/

Depending on what you're doing, you may not need regex at all, and you **should not** be using regex to parse HTML: read [this answer](http://stackoverflow.com/a/1732454/3296811) for more information on that. Consider using `DOM` manipulation instead. - Quill, Oct 27 '15 at 23:21

score 0 · Answer 1 · answered Oct 27 '15 at 23:25

As @Quill said, regex may not be the best option here. But if you do need to pull something out of an HTML string this way, you can change the pattern so that it stops at the first closing </div>: replace the second (.*) with [^<]*, which matches everything up to the next angle bracket (only that div):

/<div\s?([^>]*)>([^<]*)<\/div>/

I also changed the group that captures the <div> attributes from (.*) to [^>]*, so it cannot run past the end of the opening tag.

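This is the simplest fix, but it only works when the inner div contains no other tags, since [^<]* stops at any opening angle bracket. To handle more complex content, you could use a lookahead.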
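To see the difference, here is a minimal sketch in PHP (an assumption on my part, since the question doesn't say which language the regex runs in). It runs the original greedy pattern, the negated-class pattern, and one possible lookahead variant. The `$tempered` pattern and the `$nested` sample string are my own illustrations, not something from the question.

```php
<?php
$html = '<div class="content"><div>Content</div></div>';

// Pattern from the question: the greedy (.*) runs past the inner div,
// so the overall match spans the whole outer div.
preg_match('/<div\s?(.*)>(.*)<\/div>/', $html, $greedy);

// Negated character classes: attributes stop at ">", content stops at "<".
preg_match('/<div\s?([^>]*)>([^<]*)<\/div>/', $html, $inner);

// Hypothetical lookahead variant: consume any character as long as it does
// not start another <div or </div, so other nested tags are allowed.
$nested = '<div class="content"><div>Content <b>x</b></div></div>';
preg_match('/<div\b[^>]*>((?:(?!<\/?div\b).)*)<\/div>/s', $nested, $tempered);

echo $greedy[0], "\n";   // <div class="content"><div>Content</div></div>
echo $inner[0], "\n";    // <div>Content</div>
echo $inner[2], "\n";    // Content
echo $tempered[0], "\n"; // <div>Content <b>x</b></div>
```

The lookahead version still only finds the innermost div. For anything deeper than that, a DOM parser is the safer route.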

score 0 · Answer 2 · answered Oct 27 '15 at 23:32

Regex is not appropriate for parsing HTML in this case, because you are dealing with nested (recursive) structures. Use DOMDocument:

$html = '<div class="content"><div>Content</div></div>';

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xpath = new DOMXPath($dom);
$divs = $xpath->query('//div[@class="content"]'); // All <div> elements whose "class" attribute is exactly "content"

foreach ($divs as $div) {
    // Print each child node, i.e. the inner HTML of the matched div
    foreach ($div->childNodes as $childNode) {
        echo $dom->saveHTML($childNode);
    }
}

See IDEONE demo

Result: <div>Content</div>

You will need some tweaks if your input contains invalid HTML, but I assume that is not the case here.
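One such tweak, as a rough sketch: libxml reports malformed markup as PHP warnings, and `libxml_use_internal_errors(true)` collects them instead so you can inspect or ignore them. The malformed sample string below is hypothetical, and how the parser repairs any particular broken input is up to libxml, so check the resulting tree rather than assuming its shape.

```php
<?php
// Hypothetical malformed input: a stray </p> with no matching opening tag.
$html = '<div class="content"><div>Content</div></p></div>';

// Collect parser errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// Keep the collected errors for logging, then reset libxml's state.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
$divs = $xpath->query('//div[@class="content"]');

foreach ($divs as $div) {
    foreach ($div->childNodes as $childNode) {
        echo $dom->saveHTML($childNode);
    }
}

// Each entry in $errors is a LibXMLError with a message and line number.
foreach ($errors as $error) {
    error_log(trim($error->message) . ' on line ' . $error->line);
}
```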
