I need to extract the URLs inside a string.

In my case, HTML text is stored in the database. After reading that text, I need to find all the URLs in it and insert them into another table. Can you give me a way to find the URLs in SQL or C#?

2 Answers

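This is a regular expression to find URLs in text:

// Requires: using System.Text.RegularExpressions;
// Matches http and https URLs: a host made of word characters, dots and hyphens,
// followed by any run of the path and query characters listed in the second class.
Regex regx = new Regex(@"https?://[\w.-]+[\w~!@#$%^&*()\-=+\\/?.:;',]*", RegexOptions.IgnoreCase);

// txt is the HTML text read from the database
MatchCollection matches = regx.Matches(txt);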
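To show how the matches might be collected before inserting them into another table, here is a minimal sketch. The class name `UrlExtractor`, the method `ExtractUrls`, and the simplified pattern are assumptions made for illustration; the pattern is deliberately looser than a full URL grammar and treats whitespace, quotes and angle brackets as the end of a URL.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlExtractor
{
    // Simplified pattern (an assumption, not a complete URL grammar):
    // the scheme, then every character up to whitespace, a quote or an angle bracket.
    private static readonly Regex UrlPattern = new Regex(
        @"https?://[^\s""'<>]+", RegexOptions.IgnoreCase);

    // Returns each distinct URL found in the text, in order of first appearance.
    public static List<string> ExtractUrls(string text)
    {
        var urls = new List<string>();
        foreach (Match match in UrlPattern.Matches(text))
        {
            // Trim punctuation that usually belongs to the sentence, not the URL.
            string url = match.Value.TrimEnd('.', ',', ';', ')');
            if (!urls.Contains(url))
            {
                urls.Add(url);
            }
        }
        return urls;
    }

    public static void Main()
    {
        string html = "<p>See <a href=\"http://example.com/a?x=1\">this</a> and https://example.org/b.</p>";
        foreach (string url in ExtractUrls(html))
        {
            Console.WriteLine(url);
        }
        // Prints:
        // http://example.com/a?x=1
        // https://example.org/b
    }
}
```

Each returned string can then be passed as a parameter to an INSERT statement for the second table.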
vborutenko

One way to do this is with regular expressions. The first option is to read the HTML from the database and run a regular expression over it to find the URLs directly. The second option is to locate the link tags first, then extract the URL from each of them (again with a regular expression).
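The second option could look roughly like the sketch below, which looks only at `<a>` tags and captures the quoted `href` value. The names `LinkTagExtractor` and `ExtractHrefs` are hypothetical, and the pattern assumes the attribute value is wrapped in single or double quotes; for malformed or unusual markup a dedicated HTML parser would likely be more reliable than a regular expression.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkTagExtractor
{
    // Matches an opening <a ...> tag and captures its quoted href value in group 1.
    // Assumption: the href value is enclosed in single or double quotes.
    private static readonly Regex HrefPattern = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*[""']([^""']*)[""']",
        RegexOptions.IgnoreCase);

    // Returns the href value of every link tag, in document order.
    public static List<string> ExtractHrefs(string html)
    {
        var hrefs = new List<string>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            hrefs.Add(match.Groups[1].Value);
        }
        return hrefs;
    }

    public static void Main()
    {
        string html = "<a class=\"x\" href='http://example.com/page'>one</a> "
                    + "<A HREF=\"https://example.org/\">two</A>";
        foreach (string href in ExtractHrefs(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

Unlike the first option, this ignores URLs that appear as plain text outside link tags, so the choice depends on which URLs need to end up in the second table.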

Here you can find information about how to use Regular Expressions in C#: http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx

On the other hand, writing the correct Regular Expression may not be so easy (it depends on how complex the URL is), but you should take a look at this question: regular expression for url

Also, here you can find a lot of information about regular expressions in general (keep in mind that there are some applications like RegexBuddy, that can help you a lot when it comes to testing your regular expressions): http://www.regular-expressions.info/

Community