
I'm having trouble extracting links with a regex using preg_match_all().

I have the following string:

some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a><a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>

I would like to extract each file's link and its format (the file extension) into two separate variables.

Any regex gurus here? I've been struggling with this all day.

Thanks!

Avinash Raj
nevos

1 Answer

(?<=href=")(.*?\.(.*?))\\

Try this. Just grab the captures: group 1 is the link and group 2 is the file format. See the demo:

http://regex101.com/r/lS5tT3/80

$data = 'some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a><a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>';
$regex = '/(?<=href=")(.*?\.(.*?))\\\\/';
preg_match_all($regex, $data, $matches);
print_r($matches);

Output:

Array
(
    [0] => Array
        (
            [0] => http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\
            [1] => http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\
        )

    [1] => Array
        (
            [0] => http://localhost/example/wp-content/uploads/2014/07/Link1.pdf
            [1] => http://localhost/example/wp-content/uploads/2014/07/Link2.pdf
        )

    [2] => Array
        (
            [0] => pdf
            [1] => pdf
        )

)
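
To get the links and formats into two separate variables, as the question asks, the capture groups can be assigned directly. The sketch below is not the pattern from the answer but an assumed variant: it takes the extension as the text after the last dot (so a host name such as example.com would not confuse it) and treats the stray backslash before the closing quote as optional. The names $links and $formats are only illustrative, and the pattern does not handle URLs with query strings.

```php
<?php
// Sample input from the question. In a single-quoted PHP string, \" stays
// a literal backslash followed by a double quote.
$data = 'some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a><a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>';

// Assumed variant of the pattern:
//   [^"\\]*        greedy run of anything except a quote or a backslash
//   \.([A-Za-z0-9]+)  the last dot and the extension after it (group 2)
//   \\?"           an optional stray backslash, then the closing quote
// Each \\\\ in the PHP source becomes \\ in the regex, i.e. one literal backslash.
$regex = '/href="([^"\\\\]*\.([A-Za-z0-9]+))\\\\?"/';

preg_match_all($regex, $data, $matches);

$links   = $matches[1]; // full URLs, without the trailing backslash
$formats = $matches[2]; // file extensions

print_r($links);
print_r($formats);
```

With the default PREG_PATTERN_ORDER flag, $matches[1] and $matches[2] are parallel arrays, so $links[0] and $formats[0] describe the same anchor tag.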
Avinash Raj
vks