I'm trying to change a regex that matches a URL like http://www.google.com so that it also matches a folder path such as j:\Folder\Name\Here

I'm parsing the text of a message for any links that may be present within, and creating a Process.Start(string) call with the matched string.

The regex I have now looks like this:

(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'"".,<>?«»“”‘’]))

I thought I could extend the /{1,3} part to also match \{1,1}, but that doesn't seem to work.

I'm not sure exactly what else the regex is doing, because I did not write it myself.

Does anyone have a working example already of a regex that will match URLs as well as file system folder paths? Or is there some way to change this existing regex to work for that purpose?

Zack
  • I would recommend working with two regexes: one for internet URLs and one for file system paths. Having both in one expression will make it eager (slow processing) and much harder to understand and maintain. – Andre Calil May 09 '13 at 18:23
  • I recommend you do not use regexes at all; use separate code paths for URLs, using the `Uri` class, and file paths, using the `FileInfo` class. These classes already handle parsing, matching, extracting components, and so on. – Dour High Arch May 09 '13 at 18:41
  • @AndreCalil I thought about that while driving away for lunch. It makes so much sense to use two regexes, each with its own specific purpose. This is what I'll try to implement. – Zack May 09 '13 at 19:21
  • @ZackT. Nice. I believe you'll find these regexes easily on the internet. Let us know if you have any further questions. – Andre Calil May 09 '13 at 19:29
  • You may find [this](http://www.regexlib.com/Search.aspx?k=folder) site useful. Also see [this answer](http://stackoverflow.com/questions/1141848/regex-to-match-url) for URLs. My advice is to have two regexes, one for folders and one for URLs, then combine them into one. – Victor Zakharov May 09 '13 at 21:11
  • @DourHighArch I asked a new question related to this to get clarification on your comment. http://stackoverflow.com/questions/19230073/parsing-a-string-to-extract-a-url-or-folder-path – Zack Oct 07 '13 at 16:36
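
The two-regex approach suggested in the comments could look roughly like the Python sketch below. Nothing in it comes from the thread: the patterns and the names URL_RE, PATH_RE, and find_links are hypothetical, the URL pattern only recognizes http://, https://, and www. prefixes, and the path pattern assumes drive-letter paths with no spaces in folder names. It is a starting point under those assumptions, not a complete matcher.

```python
import re

# Hypothetical patterns for illustration; much simpler than the regex in the question.
# URL: starts with http://, https://, or www., then runs until whitespace, <, > or ".
URL_RE = re.compile(r"\b(?:https?://|www\.)[^\s<>\"]+", re.IGNORECASE)
# Path: drive letter, colon, backslash, then characters that are legal in
# Windows file names. Spaces are excluded on purpose (an assumption).
PATH_RE = re.compile(r"\b[a-z]:\\[^\s<>:\"|?*]+", re.IGNORECASE)


def find_links(text):
    """Return (kind, match) pairs in the order they appear in text."""
    found = [(m.start(), "url", m.group(0)) for m in URL_RE.finditer(text)]
    found += [(m.start(), "path", m.group(0)) for m in PATH_RE.finditer(text)]
    # Sort by position, then trim sentence punctuation stuck to the end of a match.
    return [(kind, value.rstrip(".,;!?")) for _, kind, value in sorted(found)]


message = r"See http://www.google.com, or open j:\Folder\Name\Here."
for kind, value in find_links(message):
    print(kind, value)
```

For the sample message this prints `url http://www.google.com` followed by `path j:\Folder\Name\Here`. Keeping the two patterns separate means each result is already labeled as a URL or a path, which may help when deciding how to launch it.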

1 Answer

Have you tried:

[^ ]+?:(//[^ ]*|\\.+\\[^ ]*)

It will match:

http://www.google.com

and

C:\windows\temp internetfiles\

in the string:

http://www.google.com is the way to go if you want to save your file to C:\windows\temp internetfiles\ quick and easy
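
The pattern above can be tried against that sample sentence with a short Python sketch. The pattern is copied unchanged; the helper name matches and the second test string are assumptions added for illustration.

```python
import re

# The pattern from the answer, unchanged.
PATTERN = re.compile(r"[^ ]+?:(//[^ ]*|\\.+\\[^ ]*)")


def matches(text):
    """Return every full match of PATTERN in text."""
    # group(0) is the whole match; findall() would return only the inner group.
    return [m.group(0) for m in PATTERN.finditer(text)]


sample = (
    "http://www.google.com is the way to go if you want to save "
    "your file to C:\\windows\\temp internetfiles\\ quick and easy"
)
for found in matches(sample):
    print(found)

# Caveat: ".+" is greedy and may cross spaces, so two paths on one line merge.
print(matches(r"see C:\a\b and D:\c\d"))
```

The loop prints `http://www.google.com` and `C:\windows\temp internetfiles\`, as the answer describes. The last line shows a limitation: because `.+` runs to the last backslash on the line, the two paths come back as the single match `C:\a\b and D:\c\d`. A path needs at least two backslashes to match at all, so something like `j:\Folder` on its own would be skipped.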

Sedecimdies
  • I asked a new question similar to this based on Dour High Arch's feedback about the Uri and FileInfo classes. http://stackoverflow.com/questions/19230073/parsing-a-string-to-extract-a-url-or-folder-path The new question more clearly illustrates what I was trying to do when parsing a URL or filepath from a string. – Zack Oct 07 '13 at 16:38