I need a function that gives a one-to-one map from a string to another string, where the output string has the nice property of being a valid file name.

More specifically, given the URL of an image, I want to save the image under a unique name derived from that URL. I need code like this:

string url;
string uniqueName = UrlToName (url);
string fileName = path + uniqueName + ".png";

The problem is how to write the UrlToName function. A possible solution could be GetHashCode, but I don't know if it's correct.

Cristian Garcia
  • What are the rules? There is no magic. That is, given "hello!world" and "helloworld!", how should they map to a "unique key"? And what should the respective output be? – user2864740 Sep 08 '14 at 02:48
  • Base64-encode it (but replace `/` with `-`). – Blorgbeard Sep 08 '14 at 02:48
  • @Blorgbeard Might as well just appropriately URI-component encode it (at the usage site) in that case.. – user2864740 Sep 08 '14 at 02:51
  • I think a solution like the one @ianmercer gave is best. Think about it: what if more than one user uploads or stores files with the same name (me.jpg, portrait.jpg, whatever)? Aren't those supposed to be different? Do you really want to give the names of those files back in your URLs? – Random Dev Sep 08 '14 at 05:11
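
The Base64 suggestion in the comments above could be sketched roughly as follows. This is only an illustration: the class name `Base64Names` and the `NameToUrl` helper are hypothetical, and swapping `+` for `_` is an extra assumption beyond the comment's `/` to `-` replacement. The mapping is reversible, so it is one-to-one as strings, but Base64 is case-sensitive, so two names differing only in letter case would still clash on a case-insensitive file system such as default NTFS. The output is also about a third longer than the URL.

```csharp
using System;
using System.Text;

static class Base64Names
{
    // Encode the URL's UTF-8 bytes as Base64, then swap the characters
    // that are awkward in file names ('/' is a path separator).
    public static string UrlToName(string url)
    {
        string b64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(url));
        return b64.Replace('/', '-').Replace('+', '_');
    }

    // Inverse of UrlToName: undo the character swaps, then decode.
    public static string NameToUrl(string name)
    {
        string b64 = name.Replace('-', '/').Replace('_', '+');
        return Encoding.UTF8.GetString(Convert.FromBase64String(b64));
    }
}
```

Because the original URL can be recovered from the name, no separate lookup table is needed, at the cost of long and unreadable file names.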

2 Answers

Your best bet is to use a database - create a new row for each file, store the Uri in one column and a generated file name in another column (e.g. Guid.NewGuid().ToString() + ".png").

This is immune to problems with file name lengths (~255 characters in NTFS, whilst a URL could be ~2000), there is no chance of a hash collision, and you can evolve your storage scheme over time as your database grows, for example by adding directories so that you don't end up with too many files in a single directory (which makes it unusable in Explorer).

You should also be concerned about security risks if you ever create file names on a server based on external input. Much of the advice in this answer applies here too.
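
As a rough sketch of the idea, assuming an in-memory dictionary as a stand-in for the two-column table (the class `FileNameRegistry` and its method are hypothetical names, not part of any library), the lookup could look like this:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a table with columns (Uri, FileName).
// A real implementation would persist the rows in a database instead.
class FileNameRegistry
{
    private readonly Dictionary<string, string> _names =
        new Dictionary<string, string>();

    // Returns the stored file name for a URL, generating one on first use.
    public string GetOrCreateName(string url)
    {
        string name;
        if (!_names.TryGetValue(url, out name))
        {
            name = Guid.NewGuid().ToString() + ".png";
            _names[url] = name;
        }
        return name;
    }
}
```

The generated name carries no information from the URL, so its length and characters are always safe; the cost is that the mapping has to be stored somewhere durable.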

Ian Mercer
  • Will a JSON file serve for this purpose? – Cristian Garcia Sep 09 '14 at 19:57
  • Any 'database' could be used: a simple JSON file that you read and write would work but it wouldn't do well at scale or in a multiserver environment. SQLExpress would be better, or you could try something like MongoDB that's incredibly easy to install, run and to serialize and deserialize data to/from. – Ian Mercer Sep 09 '14 at 21:02
  • Thanks. It's really to keep track of downloaded images on a mobile device, so a database might be too much for the job. – Cristian Garcia Sep 14 '14 at 18:08
  • SQLite is available on most mobile devices and will serve you better in the long run than rolling your own file-based-persistence. – Ian Mercer Sep 15 '14 at 04:41
Try something like this:

private string UrlToName(string url)
{
    foreach (char c in System.IO.Path.GetInvalidFileNameChars())
    {
        url = url.Replace(c, '_');
    }
    return url;
}

This makes sure the result contains only valid file name characters. Distinct URLs usually produce distinct names, but not always: two URLs that differ only in characters that get replaced (or in a replaced character versus a literal '_') map to the same name.
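
Because the replacement above sends every invalid character to the same `_`, two different URLs can end up with the same name. One possible way around that, sketched here under the assumption that a percent-style escape is acceptable (the class name `EscapedNames` and the four-digit hex format are choices made for this illustration), is to escape each invalid character to its own code and to escape the escape character itself:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class EscapedNames
{
    // Replace each invalid file name character with '%' followed by its
    // four-digit hex code. '%' itself is escaped too, so two different
    // inputs can never produce the same output.
    public static string UrlToName(string url)
    {
        var invalid = new HashSet<char>(Path.GetInvalidFileNameChars());
        var sb = new StringBuilder(url.Length);
        foreach (char c in url)
        {
            if (c == '%' || invalid.Contains(c))
                sb.Append('%').Append(((int)c).ToString("X4"));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

The set returned by `Path.GetInvalidFileNameChars()` depends on the platform, so the names are only comparable on the same operating system, and the result can still exceed the file system's name length limit for long URLs.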

If you do need a truly unique name, even if the same URL gets passed in, try this:

private string UrlToName(string url)
{
    url = url + "_" + DateTime.Now.ToString("o");
    foreach (char c in System.IO.Path.GetInvalidFileNameChars())
    {
        url = url.Replace(c, '_');
    }
    return url;
}

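This appends a timestamp to the end of the string. The "o" format includes fractional seconds to seven decimal places, so unless you pass in the exact same URL at the exact same instant (highly unlikely), you will get a unique string back every time. Note that the name can then no longer be recomputed from the URL alone.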

Icemanind
  • Not quite: for example, `some/url` and `some>url` will return the same string. – Blorgbeard Sep 08 '14 at 03:03
  • @Blorgbeard -- True. I would argue however that a '>' sign isn't a valid url either, as noted by [this SO Answer](http://stackoverflow.com/questions/1547899/which-characters-make-a-url-invalid). But I did just update my answer to make it even more unique. – Icemanind Sep 08 '14 at 03:04
  • Should **http://server/image.jpg** and **http://server/image.jpg?a=1** return a different string, even when it's the same image? – thepirat000 Sep 08 '14 at 03:14
  • @thepirat000 - Yes. Because the URL is different, it will return a different string. It doesn't care what image is. It doesn't even care if it is an image. – Icemanind Sep 08 '14 at 03:17
  • A Url could be ~2000 characters long, a file name ~255 so this isn't going to work very well. Plus, using DateTime.Now is almost never the right answer because in many timezones it jumps backward once a year when the clocks change. – Ian Mercer Sep 08 '14 at 16:26
  • @IanMercer -- A filename in Windows has a limit of 32,767 characters, unless he's using an old operating system like Windows 95 or Windows 98. The actual path name (e.g. C:\Somedir\SomeDir\SomeDir) can't be over 255 characters, but the filename itself can be up to 32,767. – Icemanind Sep 08 '14 at 17:12
  • Straight from MSDN: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath – Icemanind Sep 08 '14 at 17:30
  • That reference discusses *Path length* limits not *File name* length limits. See http://msdn.microsoft.com/en-us/library/windows/desktop/ee681827(v=vs.85).aspx which lists both the maximum file name length and the maximum path length for each filesystem type. – Ian Mercer Sep 08 '14 at 20:59
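
The length concern raised in the comments above, together with the question's GetHashCode idea, suggests a fixed-length hash as one more option. The sketch below is an assumption-laden illustration rather than a recommendation from the thread: it uses SHA-256 instead of `GetHashCode`, because `GetHashCode` yields only 32 bits and its value for a string is not guaranteed to be the same across processes or runtime versions, so it should not be persisted. The class name `HashedNames` is hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class HashedNames
{
    // Hash the URL's UTF-8 bytes with SHA-256 and return the digest as
    // 64 lowercase hex characters, which are valid in any file name.
    public static string UrlToName(string url)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(url));
            var sb = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
                sb.Append(b.ToString("x2"));
            return sb.ToString();
        }
    }
}
```

A hash is not strictly one-to-one, since infinitely many URLs map onto a finite set of digests, but no SHA-256 collision is expected in practice. The name is always 64 characters and can be recomputed from the URL, though the URL cannot be recovered from the name.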