If the data were separated by tags, I could see using DOM parsers, but I'm picking info out of a block of text, so I already know that what I need is regex. This question is asking WHY my regex is not working. Also, no one has identified a better option than regex.
I loaded the HTML from this link: http://www.saq.com/webapp/wcs/stores/servlet/SAQStoreLocatorSearchResultsStoreDetailsView?storeLocationId=10180&catalogId=50000&langId=-1&storeIdentifier=23165&storeId=20002
I'm only interested in this bit:
<h2 class="titre filet-bottom3">Coordinates</h2>
<p>
585, avenue St-Charles<br />
Vaudreuil-Dorion, Québec<br />
J7V8P9
</p>
<p>
Phone number : 450 455-9347 <br />
Fax : 450 455-5852
</p>
The task is to extract info such as the phone number, address, city, etc. I decided on using regex because it was working for other bits of this HTML, but for this block of text it didn't take.
So this is the regex I... threw up on regex101.com:
/Coordinates<\/h2>\s+<p>\s+(.+), (\D+)<br \/>\s+(\D+),\s+(\D+)<br \/>\s+(\D\d\D\d\D\d)\s+<\/p>\s+<p>\s+P.+;:.(\d{3} \d{3}-\d{4}).+\s+F.+;:(.+|.+(\d{3} \d{3}-\d{4}))/gi
It works on regex101.com, as in the capturing groups extracted the info I wanted.
But when I put it into PHP using this:
$regex = '/Coordinates<\/h2>\s+<p>\s+(.+), (\D+)<br \/>\s+(\D+),\s+(\D+)<br \/>\s+(\D\d\D\d\D\d)\s+<\/p>\s+<p>\s+P.+;:.(\d{3} \d{3}-\d{4}).+\s+F.+;:(.+|.+(\d{3} \d{3}-\d{4}))/gi';
preg_match($regex, $data, $match);
I get no match. I was able to extract other info, such as map coordinates, using this same method. Is there a better way to do this? If not, why isn't it working?
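For debugging, it may help to tell "the pattern found nothing" apart from "PHP rejected the pattern": preg_match returns 1 on a match, 0 on no match, and false on failure. Below is a minimal sketch of that check, not a fix. The helper name describe_match and the sample string are made up for illustration, and the warning is suppressed with @ only so the return value can be inspected.

```php
<?php
// Hypothetical helper (not part of the original code): reports whether
// preg_match matched, found nothing, or failed outright.
function describe_match(string $pattern, string $subject): string
{
    // @ hides the warning so we can look at the return value instead.
    $result = @preg_match($pattern, $subject, $match);
    if ($result === false) {
        return 'error';      // pattern was rejected or matching failed
    }
    return $result === 1 ? 'match' : 'no match';
}

// Made-up sample line, modeled on the fax line in the HTML above.
$sample = "Fax : 450 455-5852";

echo describe_match('/(\d{3} \d{3}-\d{4})/i', $sample), "\n";   // flags: i
echo describe_match('/(\d{3} \d{3}-\d{4})/gi', $sample), "\n";  // flags: gi, as copied from regex101
```

If the second call reports an error while the first reports a match, the flags after the closing delimiter are worth a closer look, since regex101 flags and PHP pattern modifiers are not the same set.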
Thanks!