I want to convert a BigInteger to the shortest possible text string. There are 95 printable ASCII characters, numbered 32 to 126, and I want to use them as the digits. The idea is the same as in this code, which handles a uint in base 64:

static string ConvertToBase64Arithmetic(uint i)
{
    const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    StringBuilder sb = new StringBuilder();

    do
    {
        sb.Insert(0, alphabet[(int)(i % 64)]);
        i = i / 64;
    } while (i != 0);
    return sb.ToString();
}

My goal is a shorter text. Clearly, storing the raw bytes (BigInteger.ToByteArray()) is the most compact option, but I am only looking for a shorter text string, and Base95 is just a suggestion. My code:

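// Requires: using System.Numerics; using System.Text;
// Expects a non-negative value; a negative remainder would index out of range.
static string ConvertToBase95Arithmetic(BigInteger i)
{
    // All 95 printable ASCII characters, each exactly once
    // (the second '=' was a duplicate and is now '>', which was missing).
    const string alphabet = "mcW=2`R\\.5+46L\" !#$%&'()*,-/013789:;<>?@ABCDEFGHIJKMNOPQSTUVXYZ[]^_abdefghijklnopqrstuvwxyz{|}~";
    StringBuilder sb = new StringBuilder();

    do
    {
        // The string indexer needs an int, so cast the remainder explicitly.
        sb.Insert(0, alphabet[(int)(i % 95)]);
        // Divide by the same radix used for the remainder (95, not 64).
        i = i / 95;
    } while (i != 0);
    return sb.ToString();
}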

Of course, the text string must convert back to the original number without losing any data.
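
To make the round-trip requirement concrete, here is a minimal sketch of an encoder with a matching decoder. The class name `Base95Sketch` and the methods `Encode` and `Decode` are hypothetical names chosen for illustration. The alphabet is assumed to be the printable ASCII characters in code order (32 to 126); any permutation works as long as each character occurs exactly once. Negative values are rejected rather than handled.

```csharp
using System;
using System.Numerics;
using System.Text;

static class Base95Sketch
{
    // Assumption: digits are the 95 printable ASCII characters in code order.
    static readonly string Alphabet = BuildAlphabet();

    static string BuildAlphabet()
    {
        var sb = new StringBuilder(95);
        for (char c = ' '; c <= '~'; c++) sb.Append(c);
        return sb.ToString();
    }

    public static string Encode(BigInteger value)
    {
        if (value.Sign < 0)
            throw new ArgumentOutOfRangeException(nameof(value), "Only non-negative values are supported.");

        var sb = new StringBuilder();
        do
        {
            // Quotient and remainder in one call; the remainder is the next digit.
            value = BigInteger.DivRem(value, 95, out BigInteger remainder);
            sb.Append(Alphabet[(int)remainder]);
        } while (!value.IsZero);

        // Digits were produced least significant first, so reverse once at the end.
        char[] chars = sb.ToString().ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    public static BigInteger Decode(string text)
    {
        BigInteger result = BigInteger.Zero;
        foreach (char c in text)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0)
                throw new FormatException("Character outside the base-95 alphabet.");
            // Horner's rule: shift the accumulated value by one digit, then add.
            result = result * 95 + digit;
        }
        return result;
    }
}
```

With these definitions, `Base95Sketch.Decode(Base95Sketch.Encode(n))` should return `n` for any non-negative `n`. Appending and reversing once avoids the repeated `Insert(0, ...)` calls, which shift the whole buffer on every digit.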

  • 2
    Well, you would want `i = i / 95;` in the loop, not 64. But really, what have you gained? The number 2**128, for example, is 22 digits in base 64, and 20 digits in base 95. Is that tiny improvement really worth the trouble? Base64 is a standard. – Tim Roberts Aug 27 '22 at 19:05
  • I know it is a difficult challenge, but it is an achievement for all C# programmers. It really saves data storage in big data. –  Aug 27 '22 at 19:13
  • 2
    It is not difficult at all, and it does not "really save" storage. Base 64 stores 6 bits per digit. Base 95 stores 6.57 bits per digit. That's only a 10% savings. – Tim Roberts Aug 27 '22 at 19:32
  • If you have "big data" you probably don't want to be storing it like this anyway. Just store the raw bytes, that gets you much more saving than 10% – Charlieface Aug 27 '22 at 21:58
  • 1
    use [base91](http://base91.sourceforge.net/) instead. It results almost in the same length sequence but will be significantly faster because you can use a lookup table instead of slow division. [What is the most efficient binary to text encoding?](https://stackoverflow.com/q/971211/995714) – phuclv Aug 28 '22 at 01:02
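
The figures quoted in the comments can be checked with a few lines of C#. This is only a sketch: `DigitCount`, `Digits` and `PrintComparison` are hypothetical names, and the count is done by repeated division rather than with logarithms, to avoid floating-point rounding.

```csharp
using System;
using System.Numerics;

static class DigitCount
{
    // Number of digits needed to write a non-negative value in the given radix.
    public static int Digits(BigInteger value, int radix)
    {
        int count = 0;
        do
        {
            value /= radix;
            count++;
        } while (!value.IsZero);
        return count;
    }

    public static void PrintComparison()
    {
        BigInteger n = BigInteger.Pow(2, 128);
        Console.WriteLine(Digits(n, 64));    // 22 digits in base 64
        Console.WriteLine(Digits(n, 95));    // 20 digits in base 95
        Console.WriteLine(Math.Log(95, 2));  // bits per base-95 digit, roughly 6.57
    }
}
```

Since a base-64 digit carries exactly 6 bits and a base-95 digit carries about 6.57, the text gets roughly 9 to 10% shorter, which matches the 22 versus 20 digits above.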

0 Answers