I'm using preg_match_all to get the URLs from the anchor tags on a page. It works, but before adding them to the array I would like to wrap each one in single quotes (like this: 'url'):

preg_match_all('!<a href="(.*?)">!', $anchors, $urls);

Is there a way to do that? If so, can you point me in the right direction and show the proper way to do it?

Thank you! :D

emma

1 Answer

Instead of using a regex to parse HTML, you could use DOMDocument and getElementsByTagName:

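// Load the page and collect every <a> element
$dom = new DOMDocument;
$dom->loadHTMLFile("yourfile.html");
$anchors = $dom->getElementsByTagName("a");
$hrefs = [];

foreach ($anchors as $anchor) {
    // Skip anchors without an href, then store the URL wrapped in single quotes
    if ($anchor->hasAttribute("href")) {
        $hrefs[] = "'{$anchor->getAttribute('href')}'";
    }
}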
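
If you would rather keep the preg_match_all call from the question, one option is to wrap the captured group afterwards with array_map. This is only a minimal sketch: the sample HTML assigned to $anchors and the $quoted variable name are assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical sample input; in the question, $anchors holds the page HTML.
$anchors = '<a href="https://example.com/a">A</a> <a href="https://example.com/b">B</a>';

// Same pattern as in the question; $urls[1] receives the first capture group.
preg_match_all('!<a href="(.*?)">!', $anchors, $urls);

// Wrap every captured URL in single quotes.
$quoted = array_map(function ($url) {
    return "'" . $url . "'";
}, $urls[1]);

print_r($quoted);
```

Note that this pattern only matches anchors written exactly as `<a href="...">`, with href as the first and only attribute, which is one reason the DOMDocument approach is more robust.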
The fourth bird