I am using regex in my PHP script to check a page for Rapidshare links, and load them into an array.

My code:

if(preg_match_all('/http:\/\/rapidshare\.com\/files\/.*?\/[^\s]+/', $links[0], $links))
{
    print_r($links);
} else {
    die('Cannot find post links :(');
}

It finds the links correctly, and puts them into an array:

Array
(
    [0] => Array
        (
            [0] => http://rapidshare.com/files/320708377/file_name1.rar
            [1] => http://rapidshare.com/files/320708377/file_name1.rar
            [2] => http://rapidshare.com/files/333708133/file_name2.rar
            [3] => http://rapidshare.com/files/333708133/file_name2.rar
            [4] => http://rapidshare.com/files/330738827/file_name3.rar
            [5] => http://rapidshare.com/files/330738827/file_name3.rar
        )

)

As you can see, it enters two links into the array for each one. I have no clue why it's doing this but I suspect it's something to do with the regex.

Any help? Cheers. :)

mopoke
Matt

4 Answers

Just for the record:

$array = array_unique($values); 

It won't work as-is on multi-dimensional arrays, though, so you would have to foreach through the outer array and apply it to each inner array.
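As a minimal sketch of that idea, assuming $links has the same shape as the print_r() output in the question (the sample data and the names $unique and $deduped are made up for illustration):

```php
<?php
// Hypothetical sample shaped like the print_r() output in the question.
$links = [
    [
        'http://rapidshare.com/files/320708377/file_name1.rar',
        'http://rapidshare.com/files/320708377/file_name1.rar',
        'http://rapidshare.com/files/333708133/file_name2.rar',
        'http://rapidshare.com/files/333708133/file_name2.rar',
    ],
];

// array_unique() keeps the first occurrence of each value and preserves keys,
// so array_values() is used here to reindex the result from 0.
$unique = array_values(array_unique($links[0]));

// To cover every inner array, loop over the outer one.
$deduped = [];
foreach ($links as $i => $group) {
    $deduped[$i] = array_values(array_unique($group));
}

print_r($unique);
```

This only hides the duplicates after the fact; it does not explain why each link is matched twice.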

Tyler Carter
  • Not necessarily, see http://stackoverflow.com/questions/1247950/how-to-remove-duplicated-2-dimension-array-in-php/1248189#1248189 – Alix Axel Jan 06 '10 at 06:01

preg_match_all() will not magically duplicate URLs; each one must be occurring twice in the input. Are you running the regex on a string of HTML? If so, I suspect that for each <a> tag you're capturing both the href value and the visible link text:

<a href="http://www.example.com">http://www.example.com</a>
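If that is what is happening, one way to avoid the duplicates is to match only the href attribute. The sketch below is an illustration under assumptions, not the asker's actual setup: the $html sample is invented, and both patterns differ from the one in the question (they stop at quotes and angle brackets so the matches stay clean inside markup).

```php
<?php
// Hypothetical HTML; the real page content is not shown in the question.
$html = '<a href="http://rapidshare.com/files/320708377/file_name1.rar">'
      . 'http://rapidshare.com/files/320708377/file_name1.rar</a>' . "\n"
      . '<a href="http://rapidshare.com/files/333708133/file_name2.rar">'
      . 'http://rapidshare.com/files/333708133/file_name2.rar</a>';

// Matching bare URLs finds each link twice: once in href, once in the link text.
preg_match_all('~http://rapidshare\.com/files/\d+/[^\s"<>]+~', $html, $all);

// Anchoring on href="..." and capturing the value finds each link once.
preg_match_all('~href="(http://rapidshare\.com/files/\d+/[^"]+)"~', $html, $hrefs);

echo count($all[0]), " bare matches\n"; // 4 with the sample above
print_r($hrefs[1]);                     // the two href values
```

For anything beyond simple pages, an HTML parser such as DOMDocument would likely be more robust than a regex.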
John Rasch

Sigh. It happens because each link is a hyperlink, so the regex grabs both the URL it points to and the link text.

Matt

About the preg_match_all() call: could the subject and the matches not use the same variable name?

As written, it is too confusing.

Also, please show us the contents of $links.
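A rough sketch of the same call with separate names follows. The variable names $pageText, $matches, and $rapidshareLinks and the sample input are assumptions; the pattern is the one from the question, written with ~ delimiters so the slashes need no escaping.

```php
<?php
// Hypothetical input; the real contents of $links[0] are not shown in the question.
$pageText = "http://rapidshare.com/files/320708377/file_name1.rar\n"
          . "http://rapidshare.com/files/333708133/file_name2.rar\n";

// Same pattern as in the question, with ~ as the delimiter.
$pattern = '~http://rapidshare\.com/files/.*?/[^\s]+~';

// The subject stays untouched; results go into a separately named $matches.
if (preg_match_all($pattern, $pageText, $matches)) {
    $rapidshareLinks = $matches[0]; // index 0 holds the full-pattern matches
} else {
    $rapidshareLinks = [];
}

print_r($rapidshareLinks);
```

Keeping the input in its own variable also makes it easy to dump it and check whether each URL really appears twice.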

Tommy