I need to capture specific tags from an HTML page using PHP.
A single HTML document can contain multiple matches, and a match can span multiple lines. I ONLY need to match a tag if it includes a data-uid
value. For each match I need to capture:
- Tag name (div, span etc...)
- data-uid's value
- Child nodes (as a string).
So far, I have been able to capture the tag name and the data-uid value, but not the child nodes.
<div class="testClassOne" data-uid="123456">
<div class="testClassTwo">Content</div>
<!-- More nodes -->
</div>
Expected result: { tag: "div", data-uid: 123456, childrens: "<div class="testClassTwo">Content</div>
" }
or
<div class="testClassOne" data-uid="123456"></div>
Expected result: { tag: "div", data-uid: 123456, childrens: " " }
My current regex and callback function are as follows:
$regex = '/<(.*) (?:.*?)data-uid="([^"]*?)"(?:.*?)>(.*?)<\/\1>/';
$content = preg_replace_callback($regex, 'test', $content);
function test($arg) {
    print_r($arg);
}
Does anyone know how to resolve this issue (i.e. capture the children as a string as well)?
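For context on why the pattern struggles: without the `s` modifier, `.` does not match newlines, so `(.*?)` cannot span a multiline element, and a lazy `(.*?)<\/\1>` stops at the first closing tag of the same name, which cuts off nested `<div>` children. One direction that sidesteps both problems is a DOM parser instead of a regex. The following is only a minimal sketch under assumptions: `captureUidTags` is a hypothetical helper name, it relies on PHP's bundled DOM extension (`DOMDocument`, `DOMXPath`), and it serializes each child node back to a string with `saveHTML`, so the output may not be byte-for-byte identical to the source markup.

```php
<?php
// Hypothetical helper (not part of the original code): returns one entry per
// element that carries a data-uid attribute.
function captureUidTags(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $results = [];

    // Select every element, at any depth, that has a data-uid attribute.
    foreach ($xpath->query('//*[@data-uid]') as $node) {
        // Rebuild the inner HTML by serializing each child node in order.
        $children = '';
        foreach ($node->childNodes as $child) {
            $children .= $doc->saveHTML($child);
        }

        $results[] = [
            'tag'       => $node->nodeName,
            'data-uid'  => $node->getAttribute('data-uid'),
            'childrens' => $children,
        ];
    }

    return $results;
}

// Example call with assumed sample markup.
$sample = '<div class="testClassOne" data-uid="123456">'
        . '<div class="testClassTwo">Content</div></div>';
print_r(captureUidTags($sample));
```

Note that an element with a data-uid nested inside another matched element would appear both inside the parent's `childrens` string and as its own entry; whether that is desirable depends on the use case.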