
I'm trying to replace a link inside a string with the same link wrapped in an HTML anchor tag (<a href=...>). I'm new to regular expressions and have been reading up on them; I came up with this expression after going through SO and other sites.

$s = 'This is a stupid site: www.etsy.com';
$regEx = '#(^www\.|^http://)([a-zA-Z0-9/?\-&=_\.]+\.com|\.net|\.org|\.ca)|(/[a-zA-Z0-9/?\-_&=\.]+)#';
$ret = preg_replace( $regEx, "<a href='$1$2$3'>$1$2$3</a>", $s);
echo $ret;

This doesn't give me a link at all,

and this one doesn't include "http://" in the link:

$s = 'This is a stupid video http://www.youtube.com/watch?v=MkXVM6ad9nI';
$regEx = '#(^www\.|^http://)([a-zA-Z0-9/?\-&=_\.]+\.com|\.net|\.org|\.ca)|(/[a-zA-Z0-9/?\-_&=\.]+)#';
$ret = preg_replace( $regEx, "<a href='$1$2$3'>$1$2$3</a>", $s);
echo $ret;

I'm still trying, so this might change, but any help would be appreciated as I'm nearing my wits' end.

Thank you in advance for your time

P.S: I tried this in RegexBuddy and the whole string gets highlighted when I test... so I'm really wondering what I'm doing wrong.

wribit

2 Answers


Try with:

$s = 'This is a stupid site: www.etsy.com';
// The ^ anchors are dropped and the TLDs grouped, so .net, .org and .ca hosts match too.
$regEx = '#(www\.|http://)([a-zA-Z0-9/?\-&=_\.]+\.(?:com|net|org|ca))(/[a-zA-Z0-9/?\-_&=\.]*)?#';
$ret = preg_replace( $regEx, "<a href='$1$2$3'>$1$2$3</a>", $s);
echo $ret;

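One reason the pattern in the question finds no link in the first string is the ^ anchor: it only allows "www." at the very start of the subject, and that string starts with "This". The sketch below shows the effect with two cut-down patterns that are assumptions made for illustration, not the full expressions from the thread.

```php
<?php
// Sample string taken from the question.
$s = 'This is a stupid site: www.etsy.com';

// Hypothetical, cut-down patterns: only the leading ^ anchor differs.
$anchored   = '#^www\.[a-zA-Z0-9.-]+\.com#';
$unanchored = '#www\.[a-zA-Z0-9.-]+\.com#';

// preg_match() returns 1 on a match and 0 when nothing matches.
$anchoredHit   = preg_match($anchored, $s);
$unanchoredHit = preg_match($unanchored, $s, $m);

var_dump($anchoredHit);   // int(0): the subject starts with "This", not "www."
var_dump($unanchoredHit); // int(1)
echo $m[0], "\n";         // www.etsy.com
```

If links should only be picked up at word boundaries, \b in place of ^ might be worth trying, but that is a design choice rather than something the thread settles.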

Croises

Your regex matches, for example:

www.whatever.com

or

www..net

but not

www.whatever.net

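The | inside your second group splits the whole group, so it reads as "[...]+\.com" or "\.net" or "\.org" or "\.ca". You have to group the TLDs: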

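$regEx = '#(^www\.|^http://)([a-zA-Z0-9/?\-&=_\.]+(?:\.com|\.net|\.org|\.ca))|(/[a-zA-Z0-9/?\-_&=\.]+)#';
// The TLD alternatives now sit in their own group. It is non-capturing, (?:...),
// so $1, $2 and $3 in your replacement keep their numbering.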
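To see the effect of the grouping in isolation, here is a minimal sketch with two cut-down patterns. They are assumptions made for illustration (lower-case letters only, anchored at both ends so that only complete host names count), not the expressions from the question.

```php
<?php
// Ungrouped: the | splits the whole group into "[a-z]+\.com" OR "\.net".
$ungrouped = '#^www\.([a-z]+\.com|\.net)$#';
// Grouped: "[a-z]+" followed by ".com" OR ".net".
$grouped   = '#^www\.([a-z]+(?:\.com|\.net))$#';

$hosts   = ['www.whatever.com', 'www.whatever.net', 'www..net'];
$results = [];

foreach ($hosts as $host) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $u = preg_match($ungrouped, $host);
    $g = preg_match($grouped, $host);
    $results[$host] = [$u, $g];
    printf("%-18s ungrouped=%d grouped=%d\n", $host, $u, $g);
}
```

The ungrouped form accepts www.whatever.com and the odd www..net but rejects www.whatever.net; the grouped form accepts both real host names and rejects www..net.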

You could also simplify it. Note that the ^ anchor has to go, because your links are not at the start of the string:

$regEx = '#((?:www\.|http://)[\w/.-]+\.(?:com|net|org|ca))#';
$ret = preg_replace( $regEx, "<a href='$1'>$1</a>", $s);

Where:

(?:...) is a non-capturing group.
\w stands for [a-zA-Z0-9_].
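As a rough sketch of how the simplified form could be applied to both strings from the question: the function name linkify() is hypothetical, and the pattern differs from the one above in two assumed ways. It has no ^ anchor, so links can sit anywhere in the text, and it has an optional tail so that the path and query string stay inside the link.

```php
<?php
// Hypothetical helper; the name linkify() is not from the thread.
function linkify(string $text): string
{
    // (?:...) groups without capturing, so the whole URL is simply $1.
    // The optional tail (?:/[\w/?&=.-]*)? is an added assumption that keeps
    // the path and query string inside the link.
    $regEx = '#((?:www\.|http://)[\w.-]+\.(?:com|net|org|ca)(?:/[\w/?&=.-]*)?)#';

    // preg_replace() returns null on a regex error; fall back to the input then.
    return preg_replace($regEx, "<a href='$1'>$1</a>", $text) ?? $text;
}

echo linkify('This is a stupid site: www.etsy.com'), "\n";
echo linkify('This is a stupid video http://www.youtube.com/watch?v=MkXVM6ad9nI'), "\n";
```

Because "$1" cannot be a PHP variable name, it survives inside the double-quoted replacement string and is handed to preg_replace() as a backreference. A bare www. link still ends up as a relative href in the browser, so prepending a scheme may be needed in practice.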

Toto