
I created a WinForms application that creates a small SQLite database for every user. The database name contains the user's e-mail address.

Scenario:

A user with the e-mail address test@test.com uses the application, and the following database is created:

test@test.com.sqlite

I know there are scenarios in which an e-mail address used as a file name does not produce a valid path.

The following characters are reserved in paths: / \ ? % * : | " < > . and a space.

I'm currently using the following method to filter the e-mail address:

public string ReturnFilteredUsername(string username)
{
    username = username.Replace("<", "x");
    username = username.Replace(">", "x");
    username = username.Replace(":", "x");
    username = username.Replace("\"", "x");
    username = username.Replace("/", "x");
    username = username.Replace(@"\", "x");
    username = username.Replace("|", "x");
    username = username.Replace("?", "x");
    username = username.Replace("*", "x");
    return username;
}

At the moment, all reserved characters are replaced by an "x". I want to convert every invalid character to "%20" instead. I could simply replace the "x" with "%20", but I don't think that's a clean approach.

Can somebody suggest a cleaner method?

Thanks in advance!

Max
  • Replacing every invalid character with the same replacement character can lead to collisions. "A<B" and "A>B" are different, but your method would make them both "AxB". – yoozer8 Apr 26 '13 at 12:55
  • Coding hint: (StyleCop) SA1107 : CSharp.Readability : A line may only contain a single statement. – WhileTrueSleep Apr 26 '13 at 13:31

1 Answer


Rather than using the user's email to name the database, identify each user by a numeric ID, and use that to name the databases.

In addition to always being valid as a file name, this allows users to change their email address without you having to worry about still pointing to the correct database.
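
As a minimal sketch of the numeric-ID approach: the class and method names below are hypothetical, and the code assumes the application already stores a mapping from each e-mail address to an integer ID somewhere (for example, in a shared users table).

```csharp
using System;
using System.Globalization;
using System.IO;

public static class UserDatabasePath
{
    // Hypothetical helper: builds the database path from a numeric user ID.
    // How the ID is looked up from the e-mail address is assumed to be handled elsewhere.
    public static string ForUser(string directory, int userId)
    {
        // "D5" zero-pads the ID to at least five digits, e.g. 42 becomes "00042".
        // The invariant culture keeps the digits ASCII regardless of the machine's locale.
        string fileName = userId.ToString("D5", CultureInfo.InvariantCulture) + ".sqlite";
        return Path.Combine(directory, fileName);
    }
}
```

Because the file name contains only digits and a fixed extension, no escaping is needed, and changing a user's e-mail address only touches the lookup table, not the file.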


Alternatively, if you're set on using email addresses for the names, see this answer about which characters are valid in an email address. As long as all of the email addresses are valid, you only need to worry about escaping those characters that are both valid in emails and invalid in paths.

I'd suggest either using a valid-for-path and invalid-for-email character to start an escape sequence (different sequence for each character you'll have to escape), or selecting a less common character valid for both, and using that as an escape character (remembering to escape it as well!).

Example:

Both % and ? are valid in email addresses and appear in the question's list of characters reserved in paths. Assuming we use & to escape (valid for both!), we would create a mapping like this:

"&" = &00;
"%" = &01;
"?" = &02;

You would go through each email address and replace each occurrence of an invalid character with its escaped equivalent, making the result both as unique as the email address and safe as a path name.

"a&b?100%@example.com" would become "a&00;b&02;100&01;@example.com".
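
The mapping above could be applied along these lines. This is only a sketch: the class and method names are made up for illustration, and the table covers just the three characters from the example, so any other character that needs escaping would have to be added.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class EmailFileNameEscaper
{
    // Mapping taken from the example above; the escape character '&' is itself escaped.
    private static readonly Dictionary<char, string> Escapes = new Dictionary<char, string>
    {
        { '&', "&00;" },
        { '%', "&01;" },
        { '?', "&02;" },
    };

    // Replaces each mapped character with its escape sequence in a single pass.
    public static string Escape(string email)
    {
        var result = new StringBuilder(email.Length);
        foreach (char c in email)
        {
            string sequence;
            if (Escapes.TryGetValue(c, out sequence))
                result.Append(sequence);
            else
                result.Append(c);
        }
        return result.ToString();
    }

    // Reverses Escape; throws if an '&' does not start a known sequence.
    public static string Unescape(string escaped)
    {
        var result = new StringBuilder(escaped.Length);
        int i = 0;
        while (i < escaped.Length)
        {
            if (escaped[i] != '&')
            {
                result.Append(escaped[i]);
                i++;
                continue;
            }

            bool matched = false;
            foreach (KeyValuePair<char, string> pair in Escapes)
            {
                // Ordinal comparison of the sequence against the text starting at position i.
                if (string.CompareOrdinal(escaped, i, pair.Value, 0, pair.Value.Length) == 0)
                {
                    result.Append(pair.Key);
                    i += pair.Value.Length;
                    matched = true;
                    break;
                }
            }
            if (!matched)
                throw new FormatException("Unknown escape sequence at position " + i);
        }
        return result.ToString();
    }
}
```

Walking the string once avoids an ordering pitfall of chained Replace calls: if "%" were replaced before "&", the "&" inside the freshly inserted "&01;" would be escaped a second time.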

yoozer8
  • It is not very practical: 30+ users may log in to the application, and it will not be easy to find the right database when the files are named 00001, 00002, etc. – Max Apr 26 '13 at 12:59
  • How would that be any less practical than looking for the right database when they're named "johndoe@example.com", "bob@otherexample.net", etc? If anything, it's easier, since it will always be a valid path name and you won't ever have collisions. – yoozer8 Apr 26 '13 at 13:09