I have a homework problem about extracting URL values from HTML using PHP. I tested my code against a website, but I only need the URLs that match a specific pattern, for example: https://video.xxxxxxx/

My code:

$regexp = "/<a\s[^>]*href=([\"\']??)([^\\1 >]*?)\\1[^>]*>(.*)<\/a>/siU";
  if(preg_match_all("$regexp", $data, $matches, PREG_SET_ORDER)) {
    foreach($matches as $match) {
      echo $match[0];
    }
  }
Jnck ID
  • Please be more specific; as it is currently written, your post may never get an answer. – Pierre Feb 11 '20 at 12:53
  • Please include some sample inputs and their desired outputs along with the question. – Robo Mop Feb 11 '20 at 14:47
  • @Pierre I've edited my answer multiple times in a few minutes, so please refer to its most recent version, which is the one right now. – Robo Mop Feb 11 '20 at 15:05

1 Answer

You can try this:

<a.*?href\s*=\s*([\"\'])(.*?)\1.*?>.*?<\/a>

I've never used PHP before, so you might have to use \\1 instead of \1
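Whether the backslash needs doubling depends on how the PHP string is quoted. The sketch below is only an illustration (the variable names are made up for it): in a single-quoted string `\1` is passed through to the regex engine unchanged, while in a double-quoted string `"\1"` is read as an octal escape for a single byte, so it has to be written `"\\1"`.

```php
<?php
// Single-quoted: only \\ and \' are escapes, so this is a backslash plus "1".
$single = '\1';

// Double-quoted: "\1" is an octal escape (one byte, value 1), not a backreference.
$doubleWrong = "\1";

// Double-quoted with a doubled backslash: a backslash plus "1" again.
$doubleRight = "\\1";

var_dump(strlen($single));           // two characters
var_dump(strlen($doubleWrong));      // one character
var_dump($single === $doubleRight);  // same string
```

So with a single-quoted pattern the regex above can be used as written, apart from escaping the `'` inside the character class.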

Explanation:

It's tedious to explain every single element of this, so I'll give you a general idea. First you match the opening <a of the tag, followed lazily by any other characters or attributes, up to href=. Here we start capturing group 1, which contains the opening quotation mark, ' or ". Capturing group 2 contains your website's URL without the quotation marks. Then we use \1 to require the same kind of quotation mark that opened the value, so the URL ends at its matching closing quote.

If you want the text within the a tag, for whatever reason, wrap the last .*? (the one just before <\/a>) in parentheses; it then becomes capturing group 3.

Do note: to get the URL you'll need to use $match[2] instead of $match[0], which holds the entire matched tag.
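Putting this together with the loop from the question, a minimal sketch might look like the following. The sample HTML and the `https://video.example.com/` prefix are assumptions standing in for the real page and for the elided pattern in the question; `$urls` and `$prefix` are illustrative names.

```php
<?php
// Hypothetical sample input; in the question, $data holds the fetched HTML.
$data = <<<'HTML'
<a href="https://video.example.com/watch/1">First</a>
<a class="nav" href='https://www.example.com/about'>About</a>
<a href="https://video.example.com/watch/2" target="_blank">Second</a>
HTML;

// The pattern from above, with "~" as delimiter so "/" needs no escaping.
// Single-quoted PHP string: \1 reaches the regex engine as a backreference.
$regexp = '~<a.*?href\s*=\s*(["\'])(.*?)\1.*?>.*?</a>~si';

// Assumed prefix standing in for the pattern the question is after.
$prefix = 'https://video.example.com/';

$urls = [];
if (preg_match_all($regexp, $data, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        $url = $match[2]; // group 2 is the href value without its quotes
        // Keep only URLs that start with the wanted prefix.
        if (strncmp($url, $prefix, strlen($prefix)) === 0) {
            $urls[] = $url;
        }
    }
}

print_r($urls);
```

With this sample input only the two video.example.com links end up in `$urls`. The `U` modifier from the question is left out on purpose: it inverts greediness, which would turn the lazy `.*?` quantifiers in this pattern into greedy ones.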

Robo Mop