Regex for url formatting (www.domain.tld to anchors)

Question

I'm currently developing a little browser-based Twitter widget.

Right now I'm stuck on getting the URLs to work. I'm something of a newbie when it comes to regex (I know how to extract parts of a string, but this one is a tough one).

So, I need a regex that would search/replace

www.domain.tld -> <a href="http://www.domain.tld">http://www.domain.tld</a>

Preferably it should work with or without the http:// prefix.

Any advice is welcome. Thanks.

score 0 · Answer 1 · answered May 19 '10 at 22:19

This is how far I've got:

www\.(?:\S*)\.(?:\S{2,3})

It checks for www. at the beginning, then any non-whitespace characters, then a top-level domain of two or three characters.
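For reference, here is a minimal sketch of how that pattern could be plugged into preg_replace to produce the anchors from the question. The function name linkify_www is hypothetical, and the sketch assumes the text only contains bare www. hosts: a URL that already starts with http:// would end up with markup inserted after its prefix.

```php
<?php
// Hypothetical helper (not part of the original answer): wraps each match
// of the pattern above in an anchor and prefixes http://.
function linkify_www(string $text): string
{
    // "~" is the regex delimiter; $0 in the replacement is the whole match.
    return preg_replace(
        '~www\.(?:\S*)\.(?:\S{2,3})~',
        '<a href="http://$0">http://$0</a>',
        $text
    );
}

echo linkify_www('Visit www.domain.tld today');
// Visit <a href="http://www.domain.tld">http://www.domain.tld</a> today
```

Because the last part only allows two or three characters, a host such as www.domain.info is cut off after "inf" (or earlier), so this is a starting point rather than a finished pattern.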

Kristaps

.info? .mobi? .museum? You should probably check for that. – May 19 '10 at 22:22
Actually, I could even just check for non-whitespace chars, since most URLs have parameters too (?param=value&etc=1). Of course, I'll need to sanitize the input first, as an anti-XSS measure. – Kristaps May 19 '10 at 22:32
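A rough sketch of that idea, escaping first and then linking any run of non-whitespace characters that starts with www. The function name linkify_escaped and the choice of htmlspecialchars for the sanitizing step are assumptions, not something stated in the thread.

```php
<?php
// Hypothetical helper: escape the untrusted text, then add our own anchors.
function linkify_escaped(string $text): string
{
    // Neutralise any markup in the input before inserting tags of our own.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // \S+ keeps query strings such as ?param=value&etc=1 inside the link.
    return preg_replace(
        '~\bwww\.\S+~i',
        '<a href="http://$0">http://$0</a>',
        $safe
    );
}

echo linkify_escaped('<b>hi</b> www.domain.tld/?param=value&etc=1');
```

Escaping before matching means the & in the query string is already &amp; when it lands in the href. One known weakness of \S+ is that trailing punctuation (a full stop at the end of a sentence, for example) is swallowed into the link.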

score 0 · Answer 2 · answered May 19 '10 at 22:27

I'm in a never-ending war against regexes; I don't like them. So I'd do it like this instead:

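// Extracts the host from an anchor tag: takes everything from the first
// delimiter up to the closing `">`, then drops the leading `"http://` (8 chars).
// Note: the third argument of strstr() requires PHP 5.3 or later.
function get_domain_from_anchor($anchor, $delimiter = '"') {
    return substr(strstr(strstr($anchor, $delimiter), $delimiter.'>', true), 8);
}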

echo get_domain_from_anchor('<a href="http://www.domain.net">http://www.domain.net</a>');

// OUTPUTS: www.domain.net

Much better :D

Sune Rasmussen

I'm sorry, but I need the opposite thing. I have to convert from plaintext URLs to html anchors. – Kristaps May 19 '10 at 22:30
@Kristaps: Your question is a little unclear in that regard ;) – Sune Rasmussen May 19 '10 at 22:46

score 0 · Answer 3 · edited May 23 '17 at 12:19

I believe this is exactly what you're looking for: PHP validation/regex for URL

Some more information regarding extraction of URLs: Extract URLs from text in PHP

edited May 23 '17 at 12:19

Community

answered May 19 '10 at 22:33

Coding District

Thank you, I came up with ((?:http:\/\/|https:\/\/)(?:(?:[a-z0-9\&\.?=\-_\[\]\/])*)) Seems that'll work. Thank you! – Kristaps May 19 '10 at 22:54
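A minimal sketch of how the pattern from that comment might be applied. The function name linkify_http and the i flag are assumptions (the character class only lists lowercase letters), and the sketch assumes the text has no markup in it yet.

```php
<?php
// Hypothetical helper applying the pattern from the comment above.
function linkify_http(string $text): string
{
    // "~" is the delimiter; "i" makes the lowercase-only class match
    // uppercase too. $1 in the replacement is the captured URL.
    $pattern = '~((?:http:\/\/|https:\/\/)(?:(?:[a-z0-9\&\.?=\-_\[\]\/])*))~i';

    return preg_replace($pattern, '<a href="$1">$1</a>', $text);
}

echo linkify_http('See http://www.domain.tld/page?id=1 now');
// See <a href="http://www.domain.tld/page?id=1">http://www.domain.tld/page?id=1</a> now
```

Note that this pattern only matches URLs that already start with http:// or https://, and that characters outside the class (%, +, #, : and so on) end the match early.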

score 0 · Answer 4 · answered May 19 '10 at 23:08

Try twitter-text-php. It is ported to PHP from the official Twitter code.

From the README file:

$autolinker = new Twitter_Autolink();
$html = $autolinker->autolink("Tweet mentioning @mikenz and referring to his list @mikeNZ/sports and website http://mikenz.geek.nz");
echo $html;
