Efficient algorithm for conversion between numeral system

Question

Is there any efficient algorithm for conversion between numeral system when the size of source integer is arbitrary?

For example, assume that there is an integer array {1, 4, 8} which is 148 in decimal format as an input. It might be converted to {9, 4} in hexadecimal format, or {2, 2, 4} in octal, or {1, 0, 0, 1, 0, 1, 0, 0} in binary format, or just {148} in 1234-ary format or something.

It's simple when the actual value can be expressed in word-size supported by machine. But when it goes to arbitrary size, I cannot find efficient way better than O(n^2).

Should be possible in O(n). You might want to (also) try on math.stackexchange.com. — Michel de Ruiter, Aug 23 '10 at 08:53

score 3 · Answer 1 · answered Aug 23 '10 at 08:53

3

Divide by the base, push back the module, rinse and repeat while quotient != 0.

So, for example convert 148 in base 16

148 / 16 = 9 r 4
  9 / 16 = 0 r 9

So, 148 in hex is 0x94. This shouldn't take so long.

answered Aug 23 '10 at 08:53

Federico klez Culloca

26,308
17
56
95

Unfortunately, it is not that simple when a number is big enough - think about 10^100, which can't be represented in 1 atomic integer in current computer architecture. – summerlight Aug 23 '10 at 15:19
Of course, some abstraction like big_int class can be used, but it makes the problem worse, since the time complexity of a modular and division operation is O(n^2) and we need O(n) modular and division operation for a conversion. – summerlight Aug 23 '10 at 15:30

score 0 · Answer 2 · answered Sep 09 '22 at 17:46

Suppose we want to convert from base b1 to base b2. If b1 and b2 are both powers of a common base b say b1 = b^i1 and b2 = b^i2 then this can be done in O(n). In particular this applies to say converting between binary, octal, hexadecimal, 2^32, 2^64 etc. That's because you're really just repartitioning the bits. Consider the bits of the digits of number 3214 in octal vs hexadecimal:

3214 = 12*16^2 + 8*16 + 14 = 6*8^3 + 2*8^2 + 1*8 + 6
     = {1100, 1000, 1110} base 16
     = {110, 010, 001, 110} base 8

You can see that it's all the same bits just divided differently among the digits in each base. It's not hard to make an algorithm that converts from one to the other using bit-shifts and masks. The same would go for converting from e.g. base 9 to base 27 except that you partition the base 3 digits rather than bits and then shift those into different buckets.

Now if b1 and b2 are not commensurate then it's not so simple but it can still be done in subquadratic time using divide and conquer approaches. Section 7.1 of the textbook [1] describes algorithms 1.25 and 1.26 for base conversion. The presumption of those algorithms is that you are converting into and out of the base that is used for big integer arithmetic (and that you can do big integer arithmetic in that base). You can use this to implement arbitrary base conversion by converting b1 -> bigint -> b2 though.

Here I'll demonstrate algorithm 1.25 in Python which naturally supports big integer arithmetic out of the box:

# Algorithm 1.25:

def parse_int(S: str, B: int = 10) -> int:
    """parse string S as an integer in base B"""
    m = len(S)
    l = list(map(int, S[::-1]))
    b, k = B, m
    while k > 1:
        last = [l[-1]] if k % 2 == 1 else []
        l = [l1 + b*l2 for l1, l2 in zip(l[::2], l[1::2])]
        l.extend(last)
        b, k = b**2, (k + 1) >> 1
    [l0] = l
    return l0

Because this first converts all characters to int with str this implementation will only work for bases from 2 to 10 but had the input been provided as a list of integers rather than a string then that step wouldn't be necessary. Note that it isn't explicit here in this code but l1 + b*l2 and b**2 need to be computed with big integer arithmetic. The result returned here is a big integer.

Algorithm 1.26 can convert a big integer to a string of digits in any base. A simple implementation:

# Algorithm 1.26:

def format_int(A: int, B: int = 10) -> str:
    """Format integer A as a string in base B"""
    if A < B:
        # Here we use str for the base case of a single digit but probably
        # there should be some cutoff for using an O(n^2) algorithm for
        # sufficiently small inputs.
        return str(A)
    else:
        # find k so that B**(2*k-2) <= A < B**(2*k)
        k = 1
        B2k = B2 = B**2
        while A > B2k:
            k += 1
            B2k *= B2
        if A == B2k:
            k += 1
        # assert B**(2*k - 2) <= A < B**(2*k)

        Q, R = divmod(A, B**k)
        r = format_int(R, B)
        q = format_int(Q, B)
        pad = '0'*(k - len(r))
        return ''.join([q, pad, r])

These algorithms both have complexity M(n)*log(n) where n is the number of bits in the big integer and M(n) is the cost of multiplying two n bit integers. If the big integer multiplication and division has subquadratic performance for multiplication then these algorithms will as well.

[1] Richard P. Brent and Paul Zimmermann, Modern Computer Arithmetic. https://members.loria.fr/PZimmermann/mca/mca-cup-0.5.9.pdf

Efficient algorithm for conversion between numeral system

2 Answers2

Linked

Related