8

I am allowing the user to choose any username they want, and it can be anything at all, such as

AC♀¿!$"Man'@

Now I need to create a directory for that user. What function should I use to escape the text so I don't get a file system problem/exception?

6 Answers

10

Whether you replace invalid characters or remove them, there's always going to be the possibility of collisions. If I were you, I'd have a separate primary key for the user (a GUID perhaps) and use that for the directory name. That way you can have your user names anything you'd like without worrying about invalid directory names.
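A minimal sketch of this idea, written in Python purely for illustration (the helper name `create_user_directory` and the choice of a random UUID are assumptions, not part of the original answer). The directory name is generated independently of the username, so no escaping is needed at all:

```python
import os
import uuid


def create_user_directory(base_dir: str) -> str:
    """Create a directory named after a fresh random ID and return that ID.

    The caller is expected to store the returned ID on the user record,
    next to the (unrestricted) username.
    """
    dir_id = uuid.uuid4().hex  # 32 hex characters, safe on common file systems
    os.makedirs(os.path.join(base_dir, dir_id))
    return dir_id
```

The username never touches the file system; only the stored ID does.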

Jacob
  • 3
    Why the downvote, anonymous user? I still believe this is a better solution. Otherwise, Bobby\ and Bobby/ would both get the same directory, and that would cause a conflict. – Jacob Jul 06 '11 at 17:04
  • The question was for a text escaping strategy compatible with filename rules. Avoiding escaping altogether by creating and maintaining a separate data structure may be a solution for the problem, but does not answer the question. – Frep D-Oronge Aug 30 '11 at 10:09
  • 2
    Meh, I think people mainly go to Stack Overflow to find solutions to their problems, not just answers to their questions as asked. If someone's asking the wrong question, like "how do I parse this HTML with a regular expression," it's better to point out the flaws in the approach. – Jacob Aug 30 '11 at 15:05
  • Actually, I stumbled across this question because I really needed to escape strings for filenames and could not implement your workaround. In other words, I had the exact problem as described by the question. Appreciate your effort, but the problem is a valid one, and requires answering, not dodging. – Frep D-Oronge Oct 10 '11 at 12:33
6

Depending on whether your characters are ASCII or Unicode, you can use the byte/character values as a replacement and use some character to mark these replaced characters (like an underscore), giving you something like _001_255_200ValidPart_095_253. Note that you have to replace the marking character itself, too (_095 in the example; 95 is the ASCII code for the underscore).
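A rough sketch of this scheme in Python, under a few assumptions that are not in the answer: the "byte values" are taken to be UTF-8 bytes, only ASCII letters and digits are kept as-is, and the names `escape_name` and `unescape_name` are made up for illustration. Every other byte, including the underscore marker, is written as `_NNN` with three decimal digits:

```python
import string

# Assumption: only ASCII letters and digits are passed through unchanged.
SAFE = set(string.ascii_letters + string.digits)


def escape_name(name: str) -> str:
    """Losslessly encode a name; unsafe characters become _NNN per UTF-8 byte."""
    out = []
    for ch in name:
        if ch in SAFE:
            out.append(ch)
        else:
            out.extend(f"_{byte:03d}" for byte in ch.encode("utf-8"))
    return "".join(out)


def unescape_name(escaped: str) -> str:
    """Reverse escape_name by turning each _NNN back into its byte."""
    data = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == "_":
            data.append(int(escaped[i + 1:i + 4]))
            i += 4
        else:
            data.append(ord(escaped[i]))
            i += 1
    return data.decode("utf-8")
```

With this sketch, `escape_name("a_b")` gives `a_095b`, and `Bobby\` and `Bobby/` map to different names. Two caveats it does not handle: on case-insensitive file systems `Bob` and `bob` still name the same directory, and reserved device names such as `CON` on Windows pass through unchanged.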

schnaader
4

Use Path.GetInvalidFileNameChars or Path.GetInvalidPathChars to check for characters to remove.

http://msdn.microsoft.com/en-us/library/system.io.path.getinvalidfilenamechars.aspx

Dan
  • And be sure to check if the directory already exists (check for duplicate "usernames"). – schnaader Jun 20 '09 at 05:37
  • 1
    This doesn't help. I need a lossless escape; removing characters is lossy and may cause collisions when two different names have characters removed –  Jun 20 '09 at 05:39
  • 4
    Also note that while GetInvalidPathChars() returns characters that are invalid, it doesn't necessarily return ALL characters that are invalid. Quote: "The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names." – Bevan Jun 20 '09 at 06:20
  • -1 avoid this function like the plague. It's more trouble than it's worth. As said by Bevan, it does not return ALL invalid characters, making it completely, utterly useless. – Roman Starkov Dec 27 '09 at 11:10
  • @romkyns, could you give an example of which chars are not returned? If this answer is not sufficient, let's provide one that is better... – Dan Jul 06 '11 at 18:44
  • @Dan I don't think it returns the * for example, although I haven't checked since .NET 2. – Roman Starkov Jul 07 '11 at 16:07
  • 1
    How about converting the username to base46 representation? – henon Feb 20 '14 at 13:53
2

I think your best bet would be to create a Dictionary that maps invalid filesystem characters to a valid replacement string. Then scan the string for invalid characters, replacing with valid strings as you go. This would allow the user to pick anything they want and give you the consistency to translate it back into the user's name if you want.
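One way to sketch this in Python, with the caveat that the replacement table below is an assumption for illustration (it covers the characters commonly rejected in Windows file names, not a guaranteed-complete list). Using `%` as a marker and mapping `%` itself keeps the translation reversible:

```python
# Hypothetical replacement table; "%" is the marker and must be mapped too.
REPLACEMENTS = {
    "%": "%25",
    "/": "%2F",
    "\\": "%5C",
    ":": "%3A",
    "*": "%2A",
    "?": "%3F",
    '"': "%22",
    "<": "%3C",
    ">": "%3E",
    "|": "%7C",
}
REVERSE = {token: ch for ch, token in REPLACEMENTS.items()}


def encode(name: str) -> str:
    """Replace each mapped character with its marker token."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in name)


def decode(encoded: str) -> str:
    """Translate marker tokens back into the original characters."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == "%":
            out.append(REVERSE[encoded[i:i + 3]])
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)
```

This sketch leaves control characters, trailing dots or spaces, and reserved device names unhandled; a real table would need entries or extra checks for those.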

jasonh
  • But note that you have to be able to say which characters have been replaced afterwards. Turning $User to DollarUser won't help. – schnaader Jun 20 '09 at 06:00
  • True, but you could create a sequence of characters that a user wouldn't be allowed to use. Much like in C#, when the compiler emits code for certain things. – jasonh Jun 20 '09 at 06:03
1

You're not going to find a way to escape the username that will give a valid non-clashing directory name in every instance.

A more reliable approach is to create the directory using some arbitrary convention, and then store a mapping between the two. This also supports the case where your user wants to be able to change their name.
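A small sketch of such a mapping, using Python's built-in `sqlite3` module as a stand-in for whatever store the application already has. The table layout and the function names (`open_registry`, `add_user`, `rename_user`, `directory_for`) are assumptions for illustration only:

```python
import sqlite3
import uuid


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the table that maps usernames to directory names."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "name TEXT PRIMARY KEY, dir_name TEXT NOT NULL UNIQUE)"
    )
    return conn


def add_user(conn: sqlite3.Connection, name: str) -> str:
    """Register a user under an arbitrary, file-system-safe directory name."""
    dir_name = uuid.uuid4().hex
    conn.execute("INSERT INTO users (name, dir_name) VALUES (?, ?)", (name, dir_name))
    conn.commit()
    return dir_name


def rename_user(conn: sqlite3.Connection, old: str, new: str) -> None:
    """Change the username; the directory on disk is left untouched."""
    conn.execute("UPDATE users SET name = ? WHERE name = ?", (new, old))
    conn.commit()


def directory_for(conn: sqlite3.Connection, name: str):
    """Return the directory name for a user, or None if unknown."""
    row = conn.execute("SELECT dir_name FROM users WHERE name = ?", (name,)).fetchone()
    return row[0] if row else None
```

Because only the `name` column changes on a rename, nothing on disk has to move.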

Check out Question #663520 for more on this.

Bevan
0

Does the user need to know the exact name of their directory? If not, why not create directories using arbitrary rules and associate each one with its owner in a DB or something?

nairdaen
  • That would work; I could always use the userID instead of the name. I just want the name because it is more readable for the admins, although that is perhaps pointless. –  Jun 20 '09 at 05:47