Extracting links with regex in PHP

Question

I'm having trouble extracting links with a regex using preg_match_all().

I have the following string:

some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a><a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>

I would like to extract the link to each file and the file's format (its extension) into two separate variables.

Any regex gurus here? I've been struggling with this all day.

Thanks!

score 1 · Accepted Answer · edited Sep 28 '14 at 12:32

(?<=href=")(.*?\.(.*?))\\

Try this. Just grab the captures. See the demo:

http://regex101.com/r/lS5tT3/80

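// Sample input; the backslash before each closing quote is part of the data.
$data = 'some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a><a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>';
// In a single-quoted PHP string, \\\\ becomes \\, which the regex engine reads as one literal backslash.
$regex = '/(?<=href=")(.*?\.(.*?))\\\\/';
preg_match_all($regex, $data, $matches);
print_r($matches);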

Output:

Array
(
    [0] => Array
        (
            [0] => http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\
            [1] => http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\
        )

    [1] => Array
        (
            [0] => http://localhost/example/wp-content/uploads/2014/07/Link1.pdf
            [1] => http://localhost/example/wp-content/uploads/2014/07/Link2.pdf
        )

    [2] => Array
        (
            [0] => pdf
            [1] => pdf
        )

)
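The question asked for the links and formats in two separate variables. As a minimal sketch, assuming the same sample string and pattern as above, they can be read from capture groups 1 and 2. The variable names `$links`, `$formats`, and `$extensions` are illustrative and not part of the original answer. The `pathinfo()` variant is a swapped-in alternative, not the answer's method. It is included because `.*?\.` stops at the first dot after `href="`, so the regex only yields `pdf` when the host contains no dot. That holds for `localhost`, but for a host such as `example.com` group 2 would start at `com/`.

```php
<?php
// Same sample string as in the answer; the backslash before each closing quote is part of the data.
$data = 'some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a>'
      . '<a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>';

$regex = '/(?<=href=")(.*?\.(.*?))\\\\/';

// preg_match_all() returns the number of full matches, or false on error.
$count = preg_match_all($regex, $data, $matches);

$links   = $matches[1]; // group 1: URL without the trailing backslash
$formats = $matches[2]; // group 2: text after the first dot in the URL

// Assumed alternative for the extension: derive it from the URL path,
// so it does not depend on where the first dot falls.
$extensions = array_map(
    function ($url) {
        return pathinfo(parse_url($url, PHP_URL_PATH), PATHINFO_EXTENSION);
    },
    $links
);

print_r($links);
print_r($formats);
```

If `$count` is `false` or `0`, the pattern either failed to compile or found nothing, which is worth checking before reading `$matches`.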

edited Sep 28 '14 at 12:32

Avinash Raj

172,303
28
230
274

answered Sep 28 '14 at 11:38

vks

67,027
10
91
124

That's exactly what I need. But for some odd reason, using this expression in preg_match_all() doesn't work for me. It returns NULL. Any ideas? – nevos Sep 28 '14 at 11:46
@Nevos use `groups()`; capture `group1` for the link, `group2` for the format – vks Sep 28 '14 at 11:47
What's `groups()`? I can't find a reference to it – nevos Sep 28 '14 at 11:53
@Nevos http://stackoverflow.com/questions/3459721/regex-group-in-perl-how-to-capture-elements-into-array-from-regex-group-that-ma – vks Sep 28 '14 at 11:58
@AvinashRaj thanks a lot. But why are the two links appearing twice? – vks Sep 28 '14 at 12:28
Array 0: the full matched string. Array 1: group 1. Array 2: group 2. – Avinash Raj Sep 28 '14 at 12:32
@AvinashRaj was `match_all` necessary? Can't `$1` and `$2` be used directly somehow? – vks Sep 28 '14 at 12:35
Replace the last line with `print_r($matches[1]);\n print_r($matches[2]);` – Avinash Raj Sep 28 '14 at 12:41
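On the question of using `$1` and `$2` more directly: one option, sketched here under the assumption that the same sample string and pattern are used, is the `PREG_SET_ORDER` flag of preg_match_all(). It groups the results per match rather than per capture group, so each entry carries its own link and format. The `$pairs` array and the loop variable `$m` are illustrative names, not part of the thread.

```php
<?php
$data = 'some random text <a href="http://localhost/example/wp-content/uploads/2014/07/Link1.pdf\">Link1</a>'
      . '<a href="http://localhost/example/wp-content/uploads/2014/07/Link2.pdf\">Link2</a>';

$regex = '/(?<=href=")(.*?\.(.*?))\\\\/';

// PREG_SET_ORDER: $matches[0] is the first match, $matches[1] the second, and so on.
preg_match_all($regex, $data, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $m) {
    // $m[0] is the full match, $m[1] the link (group 1), $m[2] the format (group 2).
    $pairs[] = [$m[1], $m[2]];
    echo $m[1], ' => ', $m[2], PHP_EOL;
}
```

With the default `PREG_PATTERN_ORDER`, the same data is arranged by group instead, which is why each link shows up once in array 0 (with the trailing backslash) and once in array 1.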
