I need to extract some information from HTML code. I have these two structures:
<p>Street 1a</p>
<p>12345 Berlin</p>
and
<p>
Street 1a
<br>
12345 Berlin
</p>
My question is how to extract the string 'Street 1a' from both structures with one method.
I thought about writing a method for every possible HTML structure, but that is far too much work. I also thought about parsing the whole HTML document and doing pattern matching, but that is not very elegant either. For example:
$xml = new DOMDocument();
libxml_use_internal_errors(true);
// Load the url's contents into the DOM
$xml->loadHTMLFile($url);
libxml_clear_errors();
// pattern matching now
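Instead of pattern matching, one direction might be to query the DOM with XPath. What follows is only a rough sketch, under the assumption that the street is always the first non-empty text line inside the first <p> element. The function name firstParagraphLine is hypothetical, and it takes an HTML string rather than a URL so that it can be tried without a network request:

```php
<?php
// Hypothetical helper: returns the first non-empty text line found
// directly inside the first <p> element, or null if there is none.
// Assumption: the street is always that first line.
function firstParagraphLine(string $html): ?string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parser warnings out of the output
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Text nodes that are direct children of the first <p> in the document
    foreach ($xpath->query('(//p)[1]/text()') as $node) {
        $text = trim($node->nodeValue);
        if ($text !== '') {
            return $text;
        }
    }
    return null;
}

// Structure 1: two separate paragraphs
echo firstParagraphLine('<p>Street 1a</p><p>12345 Berlin</p>'), "\n";
// Structure 2: one paragraph split by <br>
echo firstParagraphLine("<p>\nStreet 1a\n<br>\n12345 Berlin\n</p>"), "\n";
```

Both calls should give 'Street 1a', because in either structure the first text node of the first <p> holds the street. Markup that wraps the street in another element (for example a <span>) would not be covered by this query.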
Does anybody have experience with this?
Greetings and thanks!