6

I'm working with numbers in different bases (base-10, base-8, base-16, etc). I'm trying to count the number of characters in each number.

Example

Number: ABCDEF

Number of digits: 6

I know about the method based on logarithms but I'm facing some problems.

  1. This Python script outputs that it failed to calculate the number of digits correctly in 3,969 numbers out of 1,000,000.

  2. I think the method that uses logarithms could be rather slow

Links:

  • This C program must be very slow (what if I have a very great number?). It also can't deal with numbers in different bases (for example, base-16).

  • Not a dupe of this as there the OP was asking only about base-10


Edit: certainly I can calculate the length of a string but what interests me most, is if it is possible to do the calculation without convention to string. I'd want to know the algorithm that could help to do it knowing just the source-base and the base to convert to.

Edit2: source-base is base-10 and the base to convert to can be any other base.


How can we calculate the number of digits in numbers in different bases?

If I know the number in base-10, how do I calculate the number of digits in the same number converted to base-16 (base-8, etc) without performing the conversion?

Note: some Python or C code will be greatly appreciated

Community
  • 1
  • 1
ForceBru
  • 43,482
  • 10
  • 63
  • 98
  • Just an idea before writting you a whole answer, should a method like that satisfy you : find the power n needed to have for example 16^n > your_number > 16^n , because then the number of digit should be something like n... – Emmanuel Jay Apr 24 '15 at 12:31
  • 1
    Are you asking us how to debug your Python script? – abarnert Apr 24 '15 at 12:36
  • @EmmanuelJay, I think any method that's fast enough would suit. – ForceBru Apr 24 '15 at 12:36
  • @abarnert, no, I think I can manage with this myself. This script was made to test the method that uses logarithms. – ForceBru Apr 24 '15 at 12:37
  • @vib, oh, yes, I forgot to mention that `source_base` is base-10, oops, editing now... – ForceBru Apr 24 '15 at 12:44

4 Answers4

11

Logarithms shouldn't really be slow. And you can easily calculate logarithms to any base by this formula: logBaseN(x)=logBaseA(x)/logBaseA(N) - you can use ln(Base e = 2.718...) or logBase10 or whatever you have. So you don't really need a program, a formular should do it:

num_digets(N, base) = 1 + floor(log(N) / log(base))

where N is your number and base the base you want that number in.

For more explanation take a look here: http://www.mathpath.org/concepts/Num/numdigits.htm

LouYu
  • 671
  • 1
  • 6
  • 14
kratenko
  • 7,354
  • 4
  • 36
  • 61
  • Google managed to hide this link from me, gonna investigate it now – ForceBru Apr 24 '15 at 12:40
  • Thanks. And i guess n = 0 will be a special case? – Mattias Wadman Nov 10 '20 at 11:57
  • 1
    @MattiasWadman Yes, of course. That would be `1` in all bases. You would also need to take care of negative numbers, if that is an issue for you (and decide what is the sensible return value in your case - number of digits would be identical for `N` and `-N`, but for display you might want to add an additional character for the `-`). – kratenko Nov 13 '20 at 10:48
  • This does not work in practice as shown below. – smichr Aug 17 '23 at 01:02
2

Note your NumToStr() function in your Python code is incorrect due to an off-by-one in your base, it should be:

def NumToStr(num):
    str=''
    while num:
            str+=alpha[(num)%base]
            num=(num)/base
    return ''.join(list(reversed(str)))

Note that checking that this function returns the correct result would have found the error (for example, use alpha="0123456789").

With this fix we get the correct number of digits using the given formula:

nDigits = int(ceil(log(nmb, base)))

except for exact powers of the base (base**0, base**1, base**2, etc...) where it is exactly one less than what it should be. This can be fixed by changing the forumla slightly to:

nDigits = 1 + floor(log(nmb, base))

Note that even this seems to fail for some inputs (for example converting from base-10 to base-10 it incorrectly says 3 digits for 1000 and 6 digits for 1000000). This seems to be due to the inherent inprecision of floats, for example:

print floor(log(1000, 10))

outputs 2 instead of the expected 3.

Concerning your mention about performance, I would not worry about performance issues for such trivial code until you've done profiling/benchmarking that shows it to be an issue. For example, your "very slow" C code would only take at most 38 divisions by 10 for a 128-bit number. If you need better performance than this then you would run into the same issue with any trivial method mentioned here. Fastest thing I can think of would be a custom log() function using a combination of lookup table and linear interpolation but you would have to be careful about the resulting precision.

uesp
  • 6,194
  • 20
  • 15
1

I am not sure I understand your question. When you say your initial number is in base b1, does it mean you have a representation of it as a string in base b1 ? Maybe you want to construct some table which tells you which number in base b1 correspond to b2, b2^2, b2^3, ... and then compare your number to these numbers to see where it fits.

Otherwise I would go with the algorithm you mentioned, that can easily adopted to any base.

Input : your integer x, the base b2 you want to count the digits in.

int number_of_digits (int x, int b2) {
    int n = 0;
    while (x >0) {
        x/=b2;
        n++;
    }
    return n;
}

Both methods are only linear in n.

EDIT : If you want to be faster, you can implement this as a binary search. Then you can get O(log(n)).

vib
  • 2,254
  • 2
  • 18
  • 36
1

This problem is a little more delicate than suggested by the answer:

>>> num_digets(1000, 10)  # there are 4 digits
3
>>> [num_digets(1000**i,10)%3 for i in range(1,10)]  # these should all be 1
[0, 0, 0, 0, 0, 0, 0, 0, 0]

def ndigits(N, b):
    if b < 2:
        raise ValueError('base must be integer greater than 1')
    n = abs(N)
    if n < b:
        return 1
    if b == 2:
        return n.bit_length()
    d = math.floor(math.log10(n)/math.log10(b))
    b_ = b**d
    while b_ <= n:  # this will iterate 0, 1 or 2 times
        d += 1
        b_ *= b
    return d

>>> [ndigits(1000**i,10)%3 for i in range(1,10)]
[1, 1, 1, 1, 1, 1, 1, 1, 1]

The reason this requires extra care is that math.log(1000**i)/math.log(10) is a little smaller than the expected floating integer:

import math
>>> math.log(1000)/math.log(10)
2.9999999999999996

If you use math.log10 in base 10 it will work as expected, but then it will fail for other bases, like 125 in base 5. Here is a small set of tests for whatever formula you use:

    assert ndigits(2, 2) == 2
    assert ndigits(2**48 - 1, 2) == 48
    assert ndigits(1000, 10) == 4
    assert ndigits(125, 5) == 4
    assert ndigits(100, 16) == 2

SymPy has a function integer_log that can be used to calculate that exact value so the answer is 1 + integer_log(n, b)[0]. It is not complicated but it is slower than the direct use of math.log to do the calculation.

smichr
  • 16,948
  • 2
  • 27
  • 34
  • 1
    (I don't understand `If you care are`.) (You might plant a hyperlink to [kratenko's answer](/a/29847712/3789665) to not have your readers wonder whence `num_digets()`.) – greybeard Aug 17 '23 at 09:41