How to define a new number system in C++

Question

Essentially what I'm attempting to do is create a base 62 number system in C++ (an alphanumeric number system -- one that includes a-z, A-Z, and 0-9). How would something like this be accomplished? I tried using a char array like this:

const char alphaNum[62] = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', ' y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z' };

however writing functions to use that array and try to count takes way too much code to be practical (for 0 to 61, sure, just select it from the array. The problem arises when you try to do multiple-digit numbers, i.e. 00). It would be much simpler to just say foobar++;. Does anyone have a way to define number systems, or at least a way for me to make it so that I don't have to write a case for every time it reaches Z?

EDIT: it should have been const char, dunno why VS decided it would be fun not to copy some of it.

Convert from base 62 to a normal C++ integer on object instantiation. Convert from internal representation to base 62 on output. Internal representation should be in whatever format is most convenient. Internally, you would then not need to implement your own math operations. Just use plain old data type operations, and make conversions to base 62 when needed for output. — DavidO, Jun 14 '14 at 20:53

Thomas Matthews · Answer 1 · 2014-06-14T21:11:03.923

You need to separate external (to User) representation and internal representation.

Internal Representation
Internally to the computer, you should use the most efficient representation. This could be hex, binary or decimal; or don't worry about it.

When presenting to the User, you should use the External Representation.

External Representation
Your character array represents digits of your number system. (It should also be const.) You need to isolate the digits from your internal representation. For example, in base 16, we divide the number by 16 to shift the number to the right and use modulo division to get the remainder. The remainder is the digit value. Use the remainder to look up the digit representation from your array.

Try your algorithms on a smaller numeric base, such as 17 or 18. Expanding to base 62 should be a matter of changing a #define or const integer.

Edit 1: The difficult method
A more difficult method is to use one byte for each digit of your number system. A byte, octet or unsigned char, has a range of 0 to 255, so should accommodate a digit of base 62.

Use a std::vector<unsigned char> to represent your number. You'll have to decide if the most significant digit is at the front of the vector or at the end.

To increment a digit:

  add 1 to digit.
  if digit value > 62
  {
     set digit to zero.
     Load digit with next greater column value (i.e. vector[position + 1];
     Repeat at top of algorithm
  }

This is the standard algorithm regardless of base (10, 8, 16, etc.).
The fundamental rules of decimal addition, subtraction, multiplication and division still apply. (Hint).

This is a technique used in the Big Number libraries.

Thank you! Understanding the difference between the two seems to be the key! — Antonio Escalera, Jun 14 '14 at 22:04

Jarod42 · Accepted Answer · 2014-06-15T09:12:26.910

Following may help: (http://ideone.com/y1gZDF) (You may change the internal representation to fit you need, as a BigNumber).

class N62
{
public:
    explicit N62(const std::string& digits) : N62(digits.c_str(), digits.size()) {}
    N62(const char* digits, std::size_t len) : value(0u)
    {
        for (std::size_t i = 0; i != len; ++i) {
            auto pos = std::find(std::begin(base), std::end(base), digits[i]);
            if (pos == std::end(base)) {
                throw std::runtime_error("incorrect digit");
            }
            value *= 62;
            value += pos - std::begin(base);
        }
    }
    N62(std::size_t value) : value(value) {}
    operator std::size_t () const { return value; }
    std::string str() const
    {
        if (value == 0u) {
            return "0";
        }
        std::string res;
        for (std::size_t n = value; n != 0; n /= 62) {
            res.push_back(base[n % 62]);
        }
        std::reverse(res.begin(), res.end());
        return res;
    }

private:
    std::size_t value;
private:
    static constexpr char base[] =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
};

Additionally, you may add a user string literal as follow:

N62 operator "" _n62 (const char *t, std::size_t len)
{
    return N62(t, len);
}

And use it that way : "1Z"_n62 (which resolve in 123 in decimal).

AWESOME answer, thank you! That is exactly what I needed, that ideone link simply put icing on the cake. Thanks! — Antonio Escalera, Jun 14 '14 at 22:04
The array is decaying, causing std::begin and std::end to not work. :S — Antonio Escalera, Jun 14 '14 at 22:52
EDIT: derp, visual studio problem, not a problem with your code. Code::Blocks and GCC works just fine. Thanks! — Antonio Escalera, Jun 14 '14 at 23:04
@AntonioEscalera: Just updated the code to fix some conversion error. — Jarod42, Jun 15 '14 at 09:13
About Visual, it doesn't support (yet) user-defined string literal, So you can't use `operator ""`. — Jarod42, Jun 16 '14 at 06:55

score 1 · Answer 3 · answered Jun 14 '14 at 21:09

Number systems are only representations of numbers. Numbers don't mind how we writing them. So, our problems are:

How to use base62 to declare numbers in the program text?

You should declare how you mark base62 numbers. "0x" and "$" prefix is reserved for base16, "0" prefix is reserved to octal, "%" prefix for binary, "*" is used for several reasons maybe tilde is your prefix, example: ~Uu3n

Then, you should write a preprocessor, which converts your base62 numbers to decimal (or hexadecimal, as you want) format.

How to output numbers in base62?

It's harder. You should convert all number output to base62. So

 printf("%d in base62 is %D \n",value,value);

shoud be converted to

 printf("%d in base62 is %s \n",value,tob64(value));

Not too elegant.

Maybe a wrapper class should be better (like Integer in Java).

How to define a new number system in C++

3 Answers3