
I use the following to find all URLs inside $content:

 preg_match_all( '/(http[s]?:[^\s]*)/i', $content, $links );

But this depends on the http:// part being present, as in http://www.google.com/some/path.

My questions are:

1 - How can I modify it so that it also matches links that start with only www, e.g. www.google.com?

2 - The main aim is to find the links and replace each one with a value returned from another function. I tried preg_replace_callback(), but it is not working (I am probably using it wrong):

$content = preg_replace_callback(
           "/(http[s]?:[^\s]*)/i",
            "my_callback",
            $content);

function my_callback(){

// do a lot of stuff independently of preg_replace
// appending to $output...

return $output;
}

Now, in my logic (which is probably wrong), all matches in $content would be replaced by $output. What am I doing wrong?

(Please, no anonymous functions; I am testing on an old server.)

EDIT I - after the comments, trying to clarify with more details:

function o99_simple_parse($content){

$content = preg_replace_callback( '/(http[s]?:[^\s]*)/i', 'o99_simple_callback', $content );


return $content;
}

The callback:

function o99_simple_callback($url){
    // how to get the URL which is actually the match? and width ??
        $url = esc_url_raw( $link );
        $url_name = parse_url($url); 
        $url_name = $description = $url_name['host'];// get rid of http://..
        $url = 'http://something' .  urlencode($url)   . '?w=' . $width ; 
        return $url; // what i really need to replace 
    }
Obmerk Kronen
  • Check this: http://stackoverflow.com/questions/1755144/how-to-validate-domain-name-in-php, especially velcrow's response. – Gaël Barbin Mar 27 '13 at 02:41
  • Thanks, but isn't it ignoring HTTP and HTTPS URLs? Also, there is no info on the callback. Basically it is concatenating 2 regexes, no? – Obmerk Kronen Mar 27 '13 at 02:45

1 Answer


To modify the regex you already have to allow URLs that begin with www, you'd simply write this:

/((http[s]?:|www[.])[^\s]*)/i
  +         ++++++++
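
As a minimal sketch of how the modified pattern behaves with preg_match_all(), assuming a made-up sample string (the text in $content below is purely illustrative and not from the question):

```php
<?php
// Hypothetical sample text, for illustration only.
$content = 'See http://www.google.com/some/path and www.example.com/page or https://example.org today.';

// The modified pattern: either "http:" / "https:" or a leading "www.",
// followed by any run of non-whitespace characters.
$pattern = '/((http[s]?:|www[.])[^\s]*)/i';

// preg_match_all() returns the number of full matches; the matches themselves
// are written into $links, so the return value is kept apart from $content.
$count = preg_match_all($pattern, $content, $links);

// $links[0] holds the full matches, one entry per URL found.
print_r($links[0]);
```

Because the pattern consumes everything up to the next whitespace, punctuation that directly follows a URL (a trailing period or comma, say) would end up inside the match, so the result may need trimming depending on the input.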
Andrew Cheong
  • Thanks! I will check that. What about the second part, the callback? Am I using it wrong? – Obmerk Kronen Mar 27 '13 at 03:07
  • I'm not sure, because you didn't include your code. `my_callback` should accept a parameter however, like this: `my_callback($matches)`, and you'd use the items in `$matches` to construct your output. – Andrew Cheong Mar 27 '13 at 03:15
  • You seem to be misunderstanding the callback mechanism. Whatever callback you define, PHP's _internals_ will pass the parameter(s). Whether you _name_ the parameter `$matches` or `$url` doesn't matter, but in the case of `preg_replace_callback`, PHP's internals will pass a _single_ parameter: an _array_ of capture groups. In your code, in `o99_simple_callback`, you're calling this array `$url` and operating on it as if it were a string, but that's incorrect. The URL you're looking for is actually in `$url[0]` (as well as `$url[1]`). – Andrew Cheong Mar 27 '13 at 12:30
  • Thanks! That clarifies some things (+1). But I thought the way the callback works is that EACH "hit" or match goes to the callback and then comes back to the function. Otherwise I fail to see the difference between just using preg_match_all() and preg_replace_callback(). Anyhow, maybe that is stuff for another question. Thanks for the help. – Obmerk Kronen Mar 28 '13 at 02:00
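
The callback mechanism discussed in the comments above can be sketched roughly as follows. This is only an illustration under assumptions: the function name example_link_callback, the sample text, the 'http://something/' prefix, and the width of 200 are all hypothetical. It uses a named function rather than an anonymous one, so it should also run on older PHP versions.

```php
<?php
// Named callback. preg_replace_callback() calls it once for EACH match and
// passes a single argument: the array of capture groups for that match.
function example_link_callback($matches)
{
    // $matches[0] is the full match, i.e. the URL found in the text.
    $url = $matches[0];

    // Hypothetical replacement value: the prefix and the width are
    // illustrative assumptions, not values taken from the question.
    $width = 200;
    return 'http://something/' . urlencode($url) . '?w=' . $width;
}

// Hypothetical sample text.
$content = 'Visit www.google.com or http://example.com/a now.';

// Each match is replaced by whatever the callback returns for it.
$result = preg_replace_callback(
    '/((http[s]?:|www[.])[^\s]*)/i',
    'example_link_callback',
    $content
);

echo $result, "\n";
```

So each hit does go through the callback individually: the return value for one match replaces just that match, whereas preg_match_all() only collects the matches without changing the string. Since the callback receives nothing but the matches, an extra value such as a width would have to come from somewhere else, for example a global variable or an object property.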