I am building a scraper using Simple HTML DOM. Extracting email addresses works without issue, although I still need to work out how to restrict the search for emails and numbers to the contents of the body tag. The main problem is finding phone numbers with a regex. I have tried many patterns, such as /^(1?(-?\d{3})-?)?(\d{3})(-?\d{4})$/,
and they find the phone number on some pages, but most of the time they either find nothing or pick up numbers inside URLs.
What I need help with:
- Getting Simple-HTML-DOM to only find things inside of the body tag.
- Getting regex to find telephone numbers
I've tried many regex variations and checked the manuals. This is my current code:
$website = "https://www.altech-uk.com/contact-us/index.htm";
$context = stream_context_create(
    array(
        'http' => array(
            'follow_location' => false
        )
    )
);
$html = file_get_contents("$website", false, $context);
$regp = '/^(1?(-?\d{3})-?)?(\d{3})(-?\d{4})$/';
preg_match_all($regp, $html, $phonematch, PREG_SET_ORDER, 0);
$P = 0;
foreach($phonematch as $resultp) {
    echo $resultp[$P];
    $P++;
}
$html->clear();
unset($html);
It returns either no phone number or the wrong numbers from outside the body tag, and every solution I have found does not work because it only handles American numbers.
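
To make the goal concrete, here is a minimal sketch of the behavior I am after. It is not a working solution for the real page. It rests on several assumptions: it uses PHP's built-in DOMDocument rather than Simple HTML DOM, the HTML sample is made up, the function name extractBodyPhones is hypothetical, and the pattern is only a loose guess at UK-style numbers (a leading 0 or +44, with optional spaces or hyphens between digits). Unlike my pattern above, it has no ^ and $ anchors, since the number sits in the middle of a larger string.

```php
<?php
// Hypothetical helper: returns phone-like strings found in the <body> text only.
function extractBodyPhones(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // since real-world markup is rarely valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return [];
    }

    // textContent contains text nodes only, so tag attributes such as
    // href URLs are never searched.
    $text = $body->textContent;

    // Assumed pattern: "+44" or a leading "0", then 9 to 10 more digits,
    // each optionally preceded by one space or hyphen. The lookarounds stop
    // a match from starting or ending inside a longer run of digits.
    $pattern = '/(?<![\d+])(?:\+44[\s-]?|0)\d(?:[\s-]?\d){8,9}(?!\d)/';
    preg_match_all($pattern, $text, $matches);

    return $matches[0];
}

// Made-up sample: one number in <head>, one inside a URL, two in body text.
$sample = '<html><head><title>0123456789012</title></head>'
        . '<body><a href="http://example.com/01234567890">link</a>'
        . '<p>Call 01234 567890 or +44 1234 567890.</p></body></html>';

print_r(extractBodyPhones($sample));
```

With this sample I would expect only the two numbers in the paragraph to be returned, not the digits in the title or in the link URL. The pattern would certainly need tuning for other number formats.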