6

I would like a regular expression to match a valid absolute Windows directory path, where directory names can contain spaces.

Example matches:

C:\pictures\holiday  (without trailing backslash)
C:\pictures\holiday\ (or with trailing backslash)
C:\ pictures\holiday
C:\ pictures\holiday\
C:\pictures \ holiday
C:\pictures \ holiday\
C:\pictures\ holiday \

Example fails:

\pictures\holiday (no relative path allowed)
C:\pictures*\holiday (not a valid directory path)

I have tried ^[a-zA-Z]:(\\\w+)*([\\])?$ but that does not match the spaces.

I have also tried ^[a-zA-Z]:(\s)*(\\\w+)*(\s)*([\\])?$ but that works erratically.

Regular expressions are my last resort. I have also tried to validate the text box using a non-regex solution, like in this answer. But I have not found a method that works for spaces.

Thanks in advance!

InvalidBrainException
  • Are you trying to detect path traversal attacks? Are you expecting any dot symbols? – Paddy Jul 11 '14 at 16:49
  • @Paddy Nothing so cool :) I'm developing an application where the user has to be able to specify a directory via a text box. I need the regex to validate that text box. Maybe I should take path traversal attacks into account? – InvalidBrainException Jul 11 '14 at 16:56
  • Why isn't `C:\pictures@!#$1afaf\holiday` a valid directory? I am able to create a directory like that. – Avinash Raj Jul 11 '14 at 16:57
  • @AvinashRaj You're right, I just checked and the only invalid Shift+number character for a directory is the asterisk. Editing my post. – InvalidBrainException Jul 11 '14 at 16:59
  • Why isn't "C:\pictures\holiday\photo.jpg" a valid directory path? I can name a directory "photo.jpg". There's no difference in the accepted pattern for directories versus filenames because they're both "files", just with different attributes. – Brian Stephens Jul 11 '14 at 16:59
  • @BrianStephens Thanks, yes indeed that should be a valid directory path. Editing my post. My excuse is that it's Friday and after work for me :) – InvalidBrainException Jul 11 '14 at 17:02
  • No threats if the user is specifying a location on his own machine. If the path data is not coming to your machine, why would you worry? If the user is specifying a file/folder location to be uploaded, the file reading part of your program will complain anyway. And the file-reader will handle more cases (like dots, network paths etc.) – Paddy Jul 11 '14 at 17:05
  • Go home Terribad. Start on Monday :-). Finding reg-ex doesn't seem to be the last problem you'll face on this. – Paddy Jul 11 '14 at 17:07
  • @Paddy I probably should, but part of me feels _this_ close to getting it!! Typical OCD programmer, I may be. – InvalidBrainException Jul 11 '14 at 17:09
  • I know the feeling. We are this close to building a fusion reactor. For 50 years now. – Paddy Jul 11 '14 at 17:13
  • I found a better solution to this problem that does not require regular expressions. The solution is to use an improved folder browser, namely the Windows 7 browser. Install the Windows 7 API Code Pack. In Visual Studio 2013 this can be done by going to `Tools -> Library Package Manager -> Package Manager Console` and running the command `Install-Package Windows7APICodePack-Shell`. Then you will have access to `Microsoft.WindowsAPICodePack.Dialogs.CommonOpenFileDialog` which includes all the validation! SHAZAM!! – InvalidBrainException Jul 14 '14 at 09:12

3 Answers

13

Here's a regex that will work:

^[a-zA-Z]:\\(((?![<>:"/\\|?*]).)+((?<![ .])\\)?)*$

It makes the path conform to the NTFS standard (see the MSDN spec). I'll break it down:

^[a-zA-Z]:\\ matches single drive letter, with colon and backslash

(?![<>:"/\\|?*]) is a negative lookahead to ensure the next character is not invalid

((?![<>:"/\\|?*]).)+ wraps that lookahead, followed by the next character, and repeats the pair one or more times

(?<![ .])\\ is a negative lookbehind to ensure the file/directory doesn't end with a space or period. Please note: Lookbehinds are not fully implemented everywhere just yet.

All of that is repeated zero or more times, with the last backslash optional.

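For many use cases it may be best to restrict the path length to 256 characters. Note that replacing the final * with {0,256} only caps the number of repeated groups, not the number of characters, so the length is better checked separately, or with a lookahead such as (?=.{1,256}$) placed right after the ^.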
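As a rough sketch, the pattern can be exercised against the paths from the question. JavaScript is used here only for illustration (the question itself targets C#), and the helper name `isValidDirectoryPath` and its `maxLength` parameter are assumptions that are not part of the answer.

```javascript
// Pattern from the answer above. The "/" inside the character class is
// escaped because this is a JavaScript regex literal.
const WINDOWS_DIR_PATH = /^[a-zA-Z]:\\(((?![<>:"\/\\|?*]).)+((?<![ .])\\)?)*$/;

// Hypothetical helper: the length cap is checked separately from the pattern.
function isValidDirectoryPath(path, maxLength = 256) {
  return path.length <= maxLength && WINDOWS_DIR_PATH.test(path);
}

// Each backslash is doubled because these are ordinary string literals.
const samples = [
  "C:\\",
  "C:\\pictures\\holiday",
  "C:\\ pictures\\holiday\\",
  "C:\\pictures \\ holiday",   // segment ends with a space
  "\\pictures\\holiday",       // no drive letter
  "C:\\pictures*\\holiday",    // "*" is not allowed
];

for (const sample of samples) {
  console.log(JSON.stringify(sample), isValidDirectoryPath(sample));
}
```

The first three samples are accepted and the last three are rejected. Because the pattern nests one quantifier inside another, a long input that fails to match may backtrack heavily, so checking the length before running the pattern seems prudent.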

EDIT: allow root directory

Brian Stephens
  • When I put that in a string literal, Visual Studio doesn't like it and marks it in red. Just to be sure the characters weren't altered during copy-paste, I typed that thing out by hand and it still flagged it as an error. `Match match = Regex.Match(tbOutputFilePath.Text, @"^[a-zA-Z]:\\(((?![<>:"/\\|?*]).)*[^ ]\\)*((?![<>:"/\\|?*]).)*[^ ]\\?$");` – InvalidBrainException Jul 11 '14 at 17:31
  • you need to escape the double-quotes in the regex – Brian Stephens Jul 11 '14 at 17:32
  • Isn't a C# string literal supposed to do that automatically? – InvalidBrainException Jul 11 '14 at 17:34
  • Ah, I think I'm supposed to escape the double quote with another double quote, even in a string literal, as suggested in this answer http://stackoverflow.com/questions/1928909/in-c-can-i-escape-a-double-quote-in-a-verbatim-string-literal – InvalidBrainException Jul 11 '14 at 17:37
  • I also modified my answer because I realized it allowed illegal characters before the backslash. The last character now has to match the lookahead AND a lookbehind that ensures it's not a space or period. – Brian Stephens Jul 11 '14 at 17:40
  • Alright, I escaped the double-quotes, and this is the resulting regex: `@"^[a-zA-Z]:\\(((?![<>:""/\\|?*]).)*[^ ]\\)*((?![<>:""/\\|?*]).)*[^ ]\\?$"` It doesn't match a directory containing spaces and having more than one level, and it doesn't match the root directory. For example, it doesn't match C:\ or C:\ pictures \ vacation \ (And I think it broke the comment box because I can't wrap those paths in gray background anymore!) xD – InvalidBrainException Jul 11 '14 at 17:43
  • I fixed it to allow for the root directory, and I realized I didn't need the repeated pattern for the last level of the path. It intentionally doesn't match "C:\pictures \" because the spec says a file/directory can't end in a space or period. – Brian Stephens Jul 11 '14 at 17:48
  • Then it would be nice if Windows would prevent the user from creating directories that violate their specs, such as directories ending with spaces. :/ Thanks for all your help, much appreciated! – InvalidBrainException Jul 11 '14 at 17:59
1

The following regular expression works for me to validate custom rules against a path-like string.

/^[a-z]:(((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+(\\|\/)?)$/i

var path = "C:\\backup\\newFolder";          // valid
// var path = "C:\\backup\\newFolder\\";     // valid
// var path = "C:/backup/newFolder\\";       // valid
// var path = "C:\\\\backup\\newFolder";     // invalid (doubled backslash)
// var path = "C:\\backup//\\newFolder";     // invalid (consecutive separators)
// var path = "C:\\backup\\new..Folder";     // invalid (dots are not in the allowed set)


if((/^[a-z]:(((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+(\\|\/)?)$/i.test(path))) {
    alert("valid path string");
} else {
    alert("Invalid Path String");
}
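The sample paths are easy to get wrong in ordinary string literals, where sequences such as `\b` and `\n` are read as escape characters. A minimal sketch that checks several paths at once, assuming `String.raw` is available (ES2015 and later); the names `pathPattern`, `samples`, and `results` are illustrative only:

```javascript
// Same pattern as in the answer above.
const pathPattern = /^[a-z]:(((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+(\\|\/)?)$/i;

// String.raw keeps every backslash literal, so "\b" and "\n" are not
// turned into backspace and newline characters.
const samples = [
  String.raw`C:\backup\newFolder`,
  String.raw`C:/backup/newFolder`,
  String.raw`C:\\backup\newFolder`,   // doubled backslash
  String.raw`C:\backup//\newFolder`,  // consecutive separators
  String.raw`C:\backup\new..Folder`,  // dots are not in the allowed set
];

const results = samples.map((p) => pathPattern.test(p));
samples.forEach((p, i) => console.log(p, results[i]));
```

The first two samples pass and the remaining three fail. A raw template cannot end in a single backslash, so a path with a trailing separator still needs a regular string literal with doubled backslashes.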
0
function isFileOrFolderPathValid(path)
{
    // File path: one or more segments, ending in ".extension".
    var filePattern = /^[a-z]:((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+\.[a-zA-Z0-9]+$/i;

    // Folder path: one or more segments, no extension and no trailing separator.
    var folderPattern = /^[a-z]:((\\|\/)[a-z0-9\s_@\-^!#$%&+={}\[\]]+)+$/i;

    // Alternative that was tried earlier:
    //result = /^[a-zA-Z]:\\(\w+\\)*\w*$/.test(path);

    return filePattern.test(path) || folderPattern.test(path);
}
Unheilig
Perumal