I am building a scraper using Simple HTML DOM. Finding email addresses works without issue, although I still need to work out how to restrict the search for emails and numbers to the contents of the body tag. The main issue, though, is finding phone numbers with regex. I have tried many patterns, such as /^(1?(-?\d{3})-?)?(\d{3})(-?\d{4})$/. It finds the phone number on some pages but not all; most of the time it either finds nothing or picks up numbers inside URLs.

What I need help with:

  • Getting Simple HTML DOM to only find things inside the body tag.
  • Getting regex to find telephone numbers (not just American ones).

I've tried many regex variations and checked the manuals.

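        $website = "https://www.altech-uk.com/contact-us/index.htm";

        // Do not follow redirects when fetching the page.
        $context = stream_context_create(
            array(
                'http' => array(
                    'follow_location' => false
                )
            )
        );

        // file_get_contents() returns a plain string (or false on failure),
        // not a Simple HTML DOM object.
        $html = file_get_contents($website, false, $context);
        if ($html === false) {
            die("Could not fetch $website");
        }

        // The pattern I am having trouble with.
        $regp = '/^(1?(-?\d{3})-?)?(\d{3})(-?\d{4})$/';
        preg_match_all($regp, $html, $phonematch, PREG_SET_ORDER, 0);
        foreach ($phonematch as $resultp) {
            // With PREG_SET_ORDER, index 0 of each set is the full match.
            echo $resultp[0], "\n";
        }

        // $html is a string here, so there is no clear() method to call.
        unset($html);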
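One thing that may explain the "no matches" case: the pattern starts with `^` and ends with `$`, so without the `m` flag it can only match when the entire input is a phone number, which is never true for a whole HTML page. Below is a minimal sketch of that effect. The sample string and the `\b` variant are made up for illustration and are not taken from the real page.

```php
<?php
// Hypothetical sample text, standing in for a fragment of page content.
$sample = "Call us on 555-123-4567 today";

// The pattern from the question: anchored to the start and end of the input.
$anchored = '/^(1?(-?\d{3})-?)?(\d{3})(-?\d{4})$/';

// Same pattern with the anchors swapped for word boundaries (an assumption,
// not a recommended final pattern).
$unanchored = '/\b(1?(-?\d{3})-?)?(\d{3})(-?\d{4})\b/';

$anchoredCount   = preg_match_all($anchored, $sample, $a);
$unanchoredCount = preg_match_all($unanchored, $sample, $b);

echo "anchored matches: ", $anchoredCount, "\n";      // 0
echo "unanchored match: ", $b[0][0] ?? 'none', "\n";  // 555-123-4567
```

The pattern is still shaped for American numbers, so this only shows the anchoring problem, not a fix for UK formats.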

It either finds no phone number or returns the wrong numbers from outside the body tag, and every solution I have found does not work because it only handles American numbers.
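To make the goal concrete, here is a rough sketch of the behaviour I am after. It uses PHP's built-in `DOMDocument` instead of Simple HTML DOM (a swap, so that the example is self-contained) and assumes the `dom` extension is enabled. The helper names `bodyText` and `findUkStylePhones`, the sample markup, and the phone numbers are all invented for illustration, and the pattern is only a loose guess at UK-style formats (a leading `0` or `+44`, then 9 to 10 more digits with optional spaces or hyphens), not a validator.

```php
<?php
// Hypothetical markup standing in for the downloaded page.
$html = <<<HTML
<html>
<head><script src="https://example.com/js/01234567890.js"></script></head>
<body>
<p>Tel: 01234 567890 or +44 (0)20 7946 0123</p>
<a href="https://example.com/item/07700900123">Link</a>
</body>
</html>
HTML;

// Return only the text inside <body>; attribute values such as href
// are not part of textContent, so numbers inside link URLs are skipped.
function bodyText(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate messy real-world markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    $body = $doc->getElementsByTagName('body')->item(0);
    return $body === null ? '' : $body->textContent;
}

// Loose, assumed pattern for UK-style numbers. The lookbehind rejects
// candidates glued to a preceding digit or slash (e.g. inside a URL
// that appears as visible text).
function findUkStylePhones(string $text): array
{
    $pattern = '/(?<![\d\/])(?:\+44[ -]?(?:\(0\)[ -]?)?|0)\d(?:[ -]?\d){8,9}(?!\d)/';
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

$phones = findUkStylePhones(bodyText($html));
print_r($phones);
```

For the sample markup this returns `01234 567890` and `+44 (0)20 7946 0123`, and ignores the digits in the `href` and in the `<head>` script URL. Text inside `<script>` or `<style>` elements that sit within the body would still be searched, so those may need stripping first.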

simong93
  • Possible duplicate of [A comprehensive regex for phone number validation](https://stackoverflow.com/questions/123559/a-comprehensive-regex-for-phone-number-validation) – freeek Sep 25 '19 at 16:53
  • That one doesn't work; it seems to be for American phone numbers – simong93 Sep 25 '19 at 18:16
  • It is a general idea that applies to all phone numbers. If you specify your problem more precisely, maybe we can help you. – freeek Sep 26 '19 at 07:35
