18

I am consuming the Twitter API and want to convert all URLs to hyperlinks.

What is the most effective way you've come up with to do this?

from

string myString = "This is my tweet check it out http://tinyurl.com/blah";

to

This is my tweet check it out <a href="http://tinyurl.com/blah">http://tinyurl.com/blah</a>
abatishchev
Nathan Birkes
  • For this application you should look for a solution which exactly matches how Twitter itself parses out URLs. A regular expression might work; just make sure to use the same condition for matching the end of the URL (vs. things like dots and right parentheses) as Twitter does. – Kevin Reid Apr 18 '10 at 11:17

5 Answers

24

Regular expressions are probably your friend for this kind of task:

// Requires: using System.Text.RegularExpressions;
Regex r = new Regex(@"(https?://[^\s]+)");
myString = r.Replace(myString, "<a href=\"$1\">$1</a>");

The regular expression for matching URLs might need a bit of work.

Jeremy Cook
samjudson
  • I think that's fine. Regular expressions are powerful, yet capturing everything up to the next whitespace is a lot better than trying to implement a URL parser in regex. I would maybe change it to `(https?://[^ ]+)` because https is not that uncommon. – John Leidegren Apr 18 '10 at 07:47
7

I did exactly the same thing with jQuery while consuming the JSON API. Here is the linkify function:

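String.prototype.linkify = function() {
    // The "g" flag replaces every URL in the string, not just the first one.
    return this.replace(/[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/g, function(m) {
        return m.link(m); // wraps m as <a href="m">m</a>
    });
};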
RedWolves
6

This is actually an ugly problem. URLs can contain (and end with) punctuation, so it can be difficult to determine where a URL actually ends, when it's embedded in normal text. For example:

http://example.com/.

is a valid URL, but it could just as easily be the end of a sentence:

I buy all my witty T-shirts from http://example.com/.

You can't simply parse until a space is found, because then you'll keep the period as part of the URL. You also can't simply parse until a period or a space is found, because periods are extremely common in URLs.

Yes, regex is your friend here, but constructing the appropriate regex is the hard part.
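One possible compromise, sketched here in JavaScript, is to match up to the next whitespace and then move any trailing punctuation back out of the link. The function name `linkifyText` and the exact set of characters treated as trailing punctuation are assumptions made for illustration; this is not how Twitter or any particular library does it.

```javascript
// Hypothetical helper, not taken from the answers here: links URLs in plain
// text and leaves trailing sentence punctuation outside the anchor.
function linkifyText(text) {
    // Match http/https followed by any run of non-whitespace characters.
    return text.replace(/https?:\/\/\S+/g, function (match) {
        // Assumption: a run of these characters at the very end of the match
        // is sentence punctuation rather than part of the URL.
        var trailing = (match.match(/[.,;:!?)\]'"]+$/) || [""])[0];
        var url = match.slice(0, match.length - trailing.length);
        return '<a href="' + url + '">' + url + "</a>" + trailing;
    });
}
```

The trade-off is the one described above: a URL that genuinely ends in a period or a closing parenthesis will lose that character, so this heuristic favors the common "URL at the end of a sentence" case. The sketch also assumes the input is plain text that needs no HTML escaping.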

Check out this as well: Expanding URLs with Regex in .NET.

Derek Park
1

/cheer for RedWolves

from: this.replace(/[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/, function(m){...

see: /[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/

That pattern matches addresses of the form "anyprotocol"://"anysubdomain/domain"."anydomainextension and address",

and it's a perfect example for other uses of string manipulation. You can slice and dice at will with .replace and insert proper "a href"s where needed.

I used jQuery to change the attributes of these links to "target=_blank" easily in my content-loading logic even though the .link method doesn't let you customize them.

I personally love tacking on a custom method to the string object for on the fly string-filtering (the String.prototype.linkify declaration), but I'm not sure how that would play out in a large-scale environment where you'd have to organize 10+ custom linkify-like functions. I think you'd definitely have to do something else with your code structure at that point.
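As a rough sketch of one alternative, the same replacement can live in a plain function that builds the anchor itself, which allows extra attributes such as target without a second jQuery pass. The function name `linkify` as a standalone function and its `attrs` parameter are hypothetical, introduced only for this example.

```javascript
// Hypothetical utility, shown as a plain function instead of a
// String.prototype extension so it can live in an ordinary module.
function linkify(text, attrs) {
    var extra = "";
    // Serialize optional attributes such as { target: "_blank" }.
    // Assumption: attribute names and values are trusted and need no escaping.
    Object.keys(attrs || {}).forEach(function (name) {
        extra += " " + name + '="' + attrs[name] + '"';
    });
    // The URL pattern from RedWolves' linkify method, applied globally.
    return text.replace(/[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/g, function (m) {
        return '<a href="' + m + '"' + extra + ">" + m + "</a>";
    });
}
```

Calling `linkify(tweet, { target: "_blank" })` would then produce links that open in a new window, and several such helpers can be grouped in one module rather than attached to the built-in String object.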

Maybe a vet will stumble along here and enlighten us.

winfred
1

You can get more control over this by using a MatchEvaluator delegate with the regular expression. Suppose I have this string:

find more on http://www.stackoverflow.com 

Now try this code:

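// Requires: using System.Text.RegularExpressions;
private string ModifyString()
{
    string input = "find more on http://www.stackoverflow.com ";
    // Optional scheme, then "www.", one or more dotted labels, and an optional path.
    // The dots are escaped and the path class no longer contains a space.
    Regex regx = new Regex(@"\b((http|https|ftp|mailto)://)?(www\.)+[\w-]+(\.[\w-]+)+(/[\w./?%&=-]*)?");
    return regx.Replace(input, new MatchEvaluator(ReplaceURl));
}

// Called once per match; returns the text that replaces the matched URL.
static string ReplaceURl(Match m)
{
    string x = m.ToString();
    // No space after "<", otherwise the browser will not treat it as a tag.
    x = "<a href=\"" + x + "\">" + x + "</a>";
    return x;
}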
Andrew Barber
herry