6

I know I've seen this done in a lot of places, but I need something a little different from the norm. Sadly, when I search for this, it gets buried in posts about just turning the link into an HTML anchor tag. I want the PHP function to strip the "http://" or "https://" from the link text, as well as anything after the .* (that is, everything after the domain), so basically what I am looking for is to turn A into B.

A: http://www.youtube.com/watch?v=spsnQWtsUFM
B: <a href="http://www.youtube.com/watch?v=spsnQWtsUFM">www.youtube.com</a>

If it helps, here is my current PHP regex replace function.

ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]", "<a href=\"\\0\" class=\"bwl\" target=\"_new\">\\0</a>", htmlspecialchars($body, ENT_QUOTES));

It would probably also be helpful to say that I have absolutely no understanding of regular expressions. Thanks!

EDIT: When I entered a comment like this: blahblah https://www.facebook.com/?sk=ff&ap=1 blah, I got HTML like this: <a class="bwl" href="blahblah https://www.facebook.com/?sk=ff&amp;ap=1 blah">www.facebook.com</a>, which doesn't work at all because it takes the text around the link with it. It works great if someone comments only a link, however. This was after I changed the function to this:

preg_replace("#^(.*)//(.*)/(.*)$#",'<a class="bwl" href="\0">\2</a>',  htmlspecialchars($body, ENT_QUOTES));
Brian Leishman
  • 2
    Always prefer `preg*` instead of `ereg*` functions, since the `ereg*` functions are slow and deprecated. – rid Jun 18 '11 at 04:50
  • possible duplicate of [How to add anchor tag to a URL from text input](http://stackoverflow.com/questions/1959062/how-to-add-anchor-tag-to-a-url-from-text-input) – outis Mar 11 '12 at 00:37

7 Answers

5

This is the simplest and cleanest way:

$str = 'http://www.youtube.com/watch?v=spsnQWtsUFM';
preg_match("#//(.+?)/#", $str, $matches);

$site_url = $matches[1];

EDIT: I assume that the $str had been checked to be a URL in the first place, so I left that out. Also, I assume that all the URLs will contain either 'http://' or 'https://'. In case the url is formatted like this www.youtube.com/watch?v=spsnQWtsUFM or even youtube.com/watch?v=spsnQWtsUFM, the above regexp won't work!
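The caveat above can be made explicit in code. This is only a sketch: `host_from_url` is a hypothetical helper name, and it simply wraps the same pattern and checks the return value of `preg_match` so that a string without `//` (or without a slash after the host) gives `null` instead of an undefined `$matches[1]`.

```php
<?php
// Hypothetical helper (not from the original answer): returns the host part
// matched by the pattern above, or null when the pattern does not match.
function host_from_url(string $str): ?string
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    if (preg_match('#//(.+?)/#', $str, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

var_dump(host_from_url('http://www.youtube.com/watch?v=spsnQWtsUFM'));
var_dump(host_from_url('www.youtube.com/watch?v=spsnQWtsUFM'));
```

Note that a URL with no slash after the host, such as http://www.youtube.com, also fails to match, because the pattern requires a `/` after the captured part.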

EDIT2: I'm sorry, I didn't realize that you were trying to replace all URLs in a whole text. In that case, this should work the way you want it:

$str = preg_replace('#(\A|[^=\]\'"a-zA-Z0-9])(http[s]?://(.+?)/[^()<>\s]+)#i', '\\1<a href="\\2">\\3</a>', $str);
Battle_707
2

I am not a regex whizz either,

^(.*)//(.*)/(.*)$
<a href="\1//\2/\3">\2</a>

was what worked for me when I tried to use it as a find and replace in Programmer's Notepad.

^(.*)// should extract the protocol, referred to as \1 in the second line. (.*)/ should extract everything till the first /, referred to as \2 in the second line. (.*)$ captures everything till the end of the string, referred to as \3 in the second line.


Added later

^(.*)( )(.*)//(.*)/(.*)( )(.*)$
\1\2<a href="\3//\4/\5">\4</a> \7

This should be a bit better, but it will replace only one URL.

Lord Loh.
  • 1
    This will work just fine (if checked to be a valid URL before calling this). As proper PHP, this would be **preg_replace("#^(.*)//(.*)/(.*)$#",'<a href="\0">\2</a>', $str)**, where \0 is the entire matched string. – Battle_707 Jun 18 '11 at 04:35
  • @stumpx: I am not sure why you selected this answer to be the correct one, but after realizing that in your situation 1) the $str value has not been checked to be a valid URL and 2) you want to replace ALL URLs in the $str, this code won't work the way you want at all. It will, first of all, not work on just http(s) links, but also ftp(s) or irc (for example). Also, it will return ONLY the HTML formatted link of the last occurring link in $str, not the rest of the string (in any shape or form). – Battle_707 Jun 18 '11 at 04:54
  • Actually this did not work. When I entered a comment like this `blahblah https://www.facebook.com/?sk=ff&ap=1 blah` I get html like this `<a class="bwl" href="blahblah https://www.facebook.com/?sk=ff&amp;ap=1 blah">www.facebook.com</a>` which doesn't work at all. It works great if someone only comments a link however – Brian Leishman Jun 18 '11 at 04:56
  • Okay, so you have the URL inside a comment... I posted the expression assuming that the string had just the URL. In that case try to find '^(.*)( )(.*)//(.*)/(.*)( )(.*)$' and replace with '\1\2<a href="\3//\4/\5">\4</a> \7'. This should be a bit better, but will only replace just 1 URL. – Lord Loh. Jun 18 '11 at 05:08
0

The code with regex does not work completely.

I made this code. It is much more comprehensive, but it works:

See the result here: http://cht.dk/data/php-scripts/inc_functions_links.php

See the source code here: http://cht.dk/data/php-scripts/inc_functions_links.txt

0

The \0 is replaced by the entire matched string, whereas \x (where x is a number other than 0 starting at 1) will be replaced by each subpart of your matched string based on what you wrap in parentheses and the order those groups appear. Your solution is as follows:

ereg_replace("[[:alpha:]]+://([^<>[:space:]]+[:alnum:]*)[[:alnum:]/]", "<a href=\"\\0\" class=\"bwl\" target=\"_new\">\\1</a>

I haven't been able to test this though so let me know if it works.
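To illustrate the difference between \0 and \1 described above, here is a minimal sketch using `preg_replace` rather than the deprecated `ereg_replace`. The pattern is an assumption written for this illustration, not the one from the question: group 1 captures everything between `://` and the next slash, so \0 is the whole URL and \1 is only the host. The sample text is not HTML-escaped here.

```php
<?php
// Sample input (illustrative only).
$body = 'see http://www.youtube.com/watch?v=spsnQWtsUFM now';

// \0 = entire match (the full URL), \1 = first parenthesized group (the host).
$html = preg_replace(
    '#[a-z]+://([^/\s<>]+)[^\s<>]*#i',
    '<a href="\\0">\\1</a>',
    $body
);

echo $html, "\n";
```

Because the pattern is not anchored with ^ and $, the text before and after the URL is left untouched.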

JMTyler
  • 1
    This function has been deprecated as of PHP 5.3.0. It would not be smart to use this function anymore. On top of that, the expression is far more complicated than it needs to be. – Battle_707 Jun 18 '11 at 04:28
  • That did not seem to work, it actually still just used the whole link and changing the number to 2 just gave me a 2 in the output – Brian Leishman Jun 18 '11 at 04:29
  • Ahh, I didn't even see your first comment. Thanks for explaining this though, I didn't even realize – Brian Leishman Jun 18 '11 at 04:35
0

I think this should do it (I haven't tested it):

preg_match('/^http[s]?:\/\/(.+?)\/.*/i', $main_url, $matches);
$final_url = '<a href="'.$main_url.'">'.$matches[1].'</a>';
Tudor Constantin
  • This won't work on https links. Also, there might not be a need to check if it's a URL in the first place, so the last part of the regexp (/.*) isn't really needed. Lastly, since the forward slash is being used so intensively in the expression, it would be smarter to use a different expression delimiter, such as ; or #. – Battle_707 Jun 18 '11 at 04:26
0

I'm surprised no one remembers PHP's parse_url function:

$url = 'http://www.youtube.com/watch?v=spsnQWtsUFM';
echo parse_url($url, PHP_URL_HOST); // displays "www.youtube.com"

I think you know what to do from there.
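For the search-and-replace case from the question, one possible way to combine `parse_url` with a regex is sketched below. The function name `linkify_hosts` and the URL pattern are assumptions made for this example; the `bwl` class is taken from the question. The text is split into URL and non-URL pieces with `preg_split`, and each piece is escaped separately so the surrounding text never ends up inside the `href`.

```php
<?php
// Hypothetical helper: wraps every http(s) URL in $text in an anchor whose
// label is the URL's host, and HTML-escapes everything else.
function linkify_hosts(string $text): string
{
    // With one capture group and PREG_SPLIT_DELIM_CAPTURE, the result
    // alternates: plain text at even indexes, matched URLs at odd indexes.
    $parts = preg_split('#(https?://[^\s<>"\']+)#i', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        $safe = htmlspecialchars($part, ENT_QUOTES);
        // parse_url returns null or false when no host can be extracted.
        $host = ($i % 2 === 1) ? parse_url($part, PHP_URL_HOST) : null;

        if (is_string($host) && $host !== '') {
            $out .= '<a class="bwl" href="' . $safe . '">'
                  . htmlspecialchars($host, ENT_QUOTES) . '</a>';
        } else {
            // Plain text, or a URL-like piece without a usable host.
            $out .= $safe;
        }
    }
    return $out;
}

echo linkify_hosts('blahblah https://www.facebook.com/?sk=ff&ap=1 blah'), "\n";
```

This sketch does not handle URLs without a scheme, and trailing punctuation directly after a URL (such as a period at the end of a sentence) would be included in the link.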

Mark Eirich
  • Of course... kind of forgot about this. I suppose I have grown so used to preg_match/preg_replace x). Anyway, parse_url will require many more lines of code. I don't know how it will benchmark against preg_replace, but I imagine that, given the fact that PHP needs to build arrays, and you probably need to use a preg_match_all to fetch all of the URLs in a text in the first place, it's not going to outperform the preg_replace function. – Battle_707 Jun 18 '11 at 05:17
  • Yeah, at first I didn't realize he was doing a search and replace in a document. I thought he was just processing a single URL.... – Mark Eirich Jun 18 '11 at 05:18
0

$result = preg_replace('%(http[s]?://)(\S+)%', '<a href="\1\2">\2</a>', $subject);

daalbert
  • ereg_replace has been deprecated since PHP 5.3.0. It would be unwise to use this function now. – Battle_707 Jun 18 '11 at 05:14
  • @Battle_707, you are correct, I was in autopilot mode and just used the same function the poster used w/o thinking about it. I updated my answer w/ preg instead. – daalbert Jun 20 '11 at 15:59