Why doesn't the following code shorten this URL? And why doesn't it turn it into an actual clickable URL? This function seems to work in all other cases but this one.

URL:

strongatheism.net/library/atheology/argument_from_noncognitivism/

Code:

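function urlfixer($text) {
    // Matches only URLs that start with a scheme ("http://", "ftp://", ...) or with "www."
    $pattern = '#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#';

    // create_function() was removed in PHP 8; a closure does the same job.
    $callback = function ($matches) {
        $url = $matches[0];

        // Link text: host plus path, without a leading "www."
        $text = parse_url($url, PHP_URL_HOST) . parse_url($url, PHP_URL_PATH);
        $text = preg_replace('/^www\./', '', $text);

        // Cut everything after the last slash and mark the cut with an ellipsis
        $last = -(strlen(strrchr($text, '/'))) + 1;
        if ($last < 0) {
            $text = substr($text, 0, $last) . '&hellip;';
        }

        // Add "http://" only if the URL has no scheme yet (keeps https:// intact)
        if (!preg_match('#^[\w-]+://#', $url)) {
            $url = 'http://' . $url;
        }
        return sprintf('<a rel="nofollow" target="_blank" href="%s">%s</a>', $url, $text);
    };

    return preg_replace_callback($pattern, $callback, $text);
}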
Tom

1 Answer

It is hard to answer your question because, depending on what exactly you are asking, I see two possible answers:

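  1. Because the regular expression does not capture it: the pattern only matches text that starts with a scheme (such as http://) or with www., and this URL has neither.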
  2. Because it is not considered a valid URL in context of the function.

For this to work properly, you need to define what constitutes a URL, either in the regular expression pattern itself or in a specification of your own (which is missing from the question).
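
A quick way to check the first answer is to run the pattern from the question against the URL directly. This is only a minimal sketch; the variable names and the `www.` variant of the URL are made up for comparison and do not appear in the question:

```php
<?php
// Pattern copied verbatim from the question.
$pattern = '#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#';

// The URL from the question: no scheme and no "www." prefix.
$bare = 'strongatheism.net/library/atheology/argument_from_noncognitivism/';

// Hypothetical variant with a "www." prefix, for comparison only.
$withWww = 'www.' . $bare;

$bareMatches = preg_match($pattern, $bare);
$wwwMatches  = preg_match($pattern, $withWww);

var_dump($bareMatches); // int(0): nothing in the string starts with "scheme:/" or "www."
var_dump($wwwMatches);  // int(1): the "www." alternative of the pattern applies
```

The same URL is found as soon as it carries one of the two prefixes the pattern insists on, which suggests the pattern, not the callback, is what rejects it.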

Good code with a complex regular expression should always describe exactly what the expression does, because such expressions tend to become cryptic. Comments like these also work well as a small specification of what qualifies as valid input. The code could look like this (example taken from a pattern that extracts a YouTube video ID):

$pattern = 
    '%^# Match any youtube URL
    (?:https?://)?  # Optional scheme. Either http or https
    (?:www\.)?      # Optional www subdomain
    (?:             # Group host alternatives
      youtu\.be/    # Either youtu.be,
    | youtube\.com  # or youtube.com
      (?:           # Group path alternatives
        /embed/     # Either /embed/
      | /v/         # or /v/
      | /watch\?v=  # or /watch?v=
      )             # End path alternatives.
    )               # End host alternatives.
    ([\w-]{10,12})  # Allow 10-12 for 11 char youtube id.
    $%x'
    ;

Because your question does not specify what constitutes a valid URL, there is not much more to say than: add a specification, fix the pattern, or both.
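
To illustrate "add a specification and fix the pattern", here is a sketch of a commented pattern in the same style. It rests on an assumed specification that is not in the question: a URL is an optional scheme, then a host made of dot-separated labels ending in a top-level domain of at least two letters, then an optional path without whitespace, parentheses or angle brackets. The name `$loosePattern` is made up, and a rule this loose will also match things like file names (`readme.txt`), so treat it as a starting point only:

```php
<?php
$loosePattern =
    '%\b                     # Start at a word boundary
    (?:[a-z][\w+.-]*://)?    # Optional scheme, e.g. http:// or https://
    (?:[a-z0-9-]+\.)+        # One or more host labels, each followed by a dot
    [a-z]{2,}                # Top-level domain: at least two letters
    (?:/[^\s()<>]*)?         # Optional path: a slash, then anything but space, (), <>
    %xi';

// The URL from the question.
$url = 'strongatheism.net/library/atheology/argument_from_noncognitivism/';

$matched = preg_match($loosePattern, $url, $m);

var_dump($matched);       // int(1)
var_dump($m[0] === $url); // bool(true): the whole URL is captured
```

If a pattern along these lines were dropped into urlfixer(), the callback would also have to cope with URLs that have no scheme, since parse_url() then reports the host as part of the path.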

The second question, however, is easier to answer:

And why doesn't it turn it into an actual clickable URL?

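Because it is not captured. The link markup is built inside the callback, and preg_replace_callback only calls it for text the pattern matches; everything else is returned unchanged.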
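
A small sketch makes this visible. The callback here is a simplified stand-in, and the `$calls` counter and the `[LINK]` placeholder are made up for illustration:

```php
<?php
// Pattern copied verbatim from the question.
$pattern = '#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#';
$input   = 'strongatheism.net/library/atheology/argument_from_noncognitivism/';

$calls  = 0;
$output = preg_replace_callback($pattern, function ($matches) use (&$calls) {
    $calls++;        // count how often the callback is invoked
    return '[LINK]'; // stand-in for the real <a> markup
}, $input);

var_dump($calls);             // int(0): the callback never ran
var_dump($output === $input); // bool(true): the text comes back untouched
```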

Community
M8R-1jmw5r