4

There are 26 letters in the alphabet (a..z) and 10 digits (0..9). Counting upper and lower case letters separately, that gives us a lexicon of 26 + 26 + 10 = 62 characters to use.

At the moment we are building part of a filename from an ID in our database. These numbers can get quite long, so we would like to shorten them. For example, instead of having:

file_459123.exe

We would rather have:

file_aB5.exe

Does anyone have a method in C# that can convert an int into a shorter case sensitive string, and convert a case sensitive string back into an int?

Example (doesn't have to be this pattern):

1 = 1
2 = 2
...
9 = 9
10 = a
11 = b
...
35 = z
36 = A
user000001
Tom Gullen
  • 1
    Convert int to bytes; remove zeros from big-end; run it through base-64... reverse: run it through base-64; pad with zeros on the big-end; convert bytes to int – Marc Gravell Jan 16 '12 at 20:06
  • 2
    Is this for Windows? (Almost certainly, as this is tagged c#, asp.net.) If so, watch out, as the file system is case insensitive. You really only have upper or lower case letters, not both, when doing this. – Kevin Brock Jan 16 '12 at 20:54
  • 1
    Aside: In cases where I have had to manage large numbers of files on Windows I have run into issues with performance of large directories. My solution has been to break apart the identification number for the file and use the pieces for a directory path, e.g. Invoice_012345.pdf would be a file in C:\HeapOStuff\01\23\. Limiting directories to 100 files worked well for my application. – HABO Jan 16 '12 at 21:09
  • 1
    I think `ToString("x")` is enough, since you won't get a **much** shorter string with Base64 for an int. Compare the Base64 and hex strings for 65535 (`//8=` vs. `ffff`) and for MaxInt (`////fw==` vs. `7fffffff`). – L.B Jan 16 '12 at 21:32
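
The byte-trimming idea from the first comment can be sketched roughly as follows. This is only an illustration, not Marc Gravell's code: the class name `CompactBase64` is made up, and swapping `+` and `/` for `-` and `_` (plus dropping the `=` padding) is an added assumption, since `/` cannot appear in a file name. The result is still case sensitive, so the Windows caveat from the other comments applies here too.

```csharp
using System;

// Hypothetical helper illustrating "int -> bytes -> trim zeros -> Base64".
public static class CompactBase64
{
    public static string Encode(int value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));

        byte[] bytes = BitConverter.GetBytes(value);
        if (BitConverter.IsLittleEndian) Array.Reverse(bytes); // make big-endian

        // Skip leading zero bytes, but always keep at least one byte.
        int start = 0;
        while (start < bytes.Length - 1 && bytes[start] == 0) start++;

        string s = Convert.ToBase64String(bytes, start, bytes.Length - start);

        // Assumption: file-name-safe variant without padding.
        return s.TrimEnd('=').Replace('+', '-').Replace('/', '_');
    }

    public static int Decode(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        s = s.PadRight(s.Length + (4 - s.Length % 4) % 4, '='); // restore padding

        byte[] trimmed = Convert.FromBase64String(s);
        if (trimmed.Length > 4) throw new FormatException("Too long for an int.");

        // Pad with zeros on the big end, then convert back.
        byte[] bytes = new byte[4];
        Array.Copy(trimmed, 0, bytes, 4 - trimmed.Length, trimmed.Length);
        if (BitConverter.IsLittleEndian) Array.Reverse(bytes);
        return BitConverter.ToInt32(bytes, 0);
    }
}
```

With this sketch, `CompactBase64.Encode(459123)` yields `BwFz`, which is four characters, the same length as a base-62 encoding of that ID.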

4 Answers

8

Despite the Base64 references, here's a generic (non-optimized) solution:

// for decimal to hexadecimal conversion use this:
//var digits = "0123456789abcdef".ToCharArray();

var digits = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
             .ToCharArray();

int number = 459123;
string output = "";

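// Assumes number is non-negative; a negative value would give a negative index.
do
{
    // Take the least significant digit and prepend it to the output.
    var digit = digits[number % digits.Length];
    output = digit + output;
    number /= digits.Length;
}
while (number > 0);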

// output is now "1Vrd"
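
To see where that output comes from, the loop amounts to repeated division by 62. With the digit string used in the snippet, indices 0 to 9 map to `0`-`9`, 10 to 35 to `a`-`z`, and 36 to 61 to `A`-`Z`, so the steps work out as:

```latex
\begin{aligned}
459123 &= 7405 \cdot 62 + 13 && \Rightarrow \texttt{d} \\
7405   &= 119 \cdot 62 + 27  && \Rightarrow \texttt{r} \\
119    &= 1 \cdot 62 + 57    && \Rightarrow \texttt{V} \\
1      &= 0 \cdot 62 + 1     && \Rightarrow \texttt{1}
\end{aligned}
\qquad
459123 = 1 \cdot 62^{3} + 57 \cdot 62^{2} + 27 \cdot 62 + 13
```

Reading the remainders from last to first gives `1Vrd`.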
M4N
6

Try Base64

  • 1
    Your link goes to the Microsoft Visual C++ 2008 Redistributable Package page. You may wish to correct it. – Chris Dunaway Jan 16 '12 at 20:07
  • 1
    Sorry, will do. I'm typing on my iPad, it's just not as reliable as a good old keyboard. –  Jan 16 '12 at 20:19
  • Base64 does not work for Windows file names (NTFS/FAT) since these systems are not case sensitive. Use Base32 instead, or choose a different set of characters that does not include both the upper and lower case of the same letter. (Same issue with some URL cases: not every server will respect different casing.) – Alexei Levenkov Jan 16 '12 at 20:55
  • @AlexeiLevenkov: That's right. The OP didn't realize it either, as he suggested mixed case. –  Jan 16 '12 at 21:00
6

Just expanding M4N's solution into a generic class:

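using System.Linq;

public class BaseX
{
    private readonly string _digits;

    public BaseX(string digits)
    {
        _digits = digits;
    }

    // Converts a non-negative int to a string using the configured digit set.
    public string ToBaseX(int number)
    {
        var output = "";
        do
        {
            output = _digits[number % _digits.Length] + output;
            number = number / _digits.Length;
        }
        while (number > 0);
        return output;
    }

    // Converts a string back to an int. Input is not validated: IndexOf returns -1
    // for characters outside the digit set, which silently corrupts the result.
    public int FromBaseX(string number)
    {
        return number.Aggregate(0, (a, c) => a * _digits.Length + _digits.IndexOf(c));
    }
}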

And then you can do:

var x = new BaseX("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ");
Console.WriteLine(x.ToBaseX(10));       // a
Console.WriteLine(x.ToBaseX(459123));   // 1Vrd
Console.WriteLine(x.ToBaseX(63));       // 11

Console.WriteLine(x.FromBaseX("1Vrd")); // 459123
Console.WriteLine(x.FromBaseX("A"));    // 36

var bin = new BaseX("01");
Console.WriteLine(bin.ToBaseX(10));     // 1010
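
Since these strings end up in file names that may be typed or altered by hand, it may be worth rejecting bad input when decoding. The following is a minimal sketch of a stricter decoder, not part of the class above: `SafeBaseX` and `TryFromBaseX` are hypothetical names, and the choice to return `false` instead of throwing is just one option.

```csharp
using System;

// Hypothetical stricter decoder: rejects unknown characters and overflow.
public static class SafeBaseX
{
    public static bool TryFromBaseX(string digits, string text, out int value)
    {
        value = 0;
        if (string.IsNullOrEmpty(text)) return false;

        long acc = 0; // accumulate in a long so overflow can be detected
        foreach (char c in text)
        {
            int d = digits.IndexOf(c); // ordinal, case-sensitive lookup
            if (d < 0) return false;   // character is not in the digit set

            acc = acc * digits.Length + d;
            if (acc > int.MaxValue) return false; // does not fit in an int
        }

        value = (int)acc;
        return true;
    }
}
```

With the 62-character digit string from the example, `TryFromBaseX(digits, "1Vrd", out var id)` returns `true` with `id` set to 459123, while `"1V-d"` is rejected because `-` is not in the digit set.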
Keith Nicholas
3

Depending on your situation, you might want to look at using Base32, as the restricted character set may be easier to read (e.g., some users cannot easily distinguish between zero and the letter O).

You can find examples here and here.
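
As a rough illustration of the idea, here is a sketch that treats the ID as a plain base-32 number, using the lower-case form of Crockford's Base32 alphabet (digits plus letters, without i, l, o and u). Note that this is positional base-32 like the other answers, not the byte-oriented Base32 of RFC 4648, and the class name `FileSafeBase32` is made up for the example. Decoding lower-cases the input first, so it survives a case-insensitive file system.

```csharp
using System;

// Hypothetical helper: positional base-32 with a case-insensitive alphabet.
public static class FileSafeBase32
{
    // Crockford's alphabet, lower-cased: no i, l, o or u.
    private const string Digits = "0123456789abcdefghjkmnpqrstvwxyz";

    public static string Encode(int number)
    {
        if (number < 0) throw new ArgumentOutOfRangeException(nameof(number));

        var output = "";
        do
        {
            output = Digits[number % Digits.Length] + output;
            number /= Digits.Length;
        }
        while (number > 0);
        return output;
    }

    public static int Decode(string text)
    {
        int result = 0;
        foreach (char c in text.ToLowerInvariant())
        {
            int d = Digits.IndexOf(c);
            if (d < 0) throw new FormatException("Unexpected character: " + c);
            result = checked(result * Digits.Length + d); // throws on overflow
        }
        return result;
    }
}
```

With this sketch, `file_459123.exe` would become `file_e0bk.exe`, and both `e0bk` and `E0BK` decode back to 459123.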

Kane
  • +1: this is a better idea than base64 for case insensitive file systems. –  Jan 16 '12 at 21:13