Edit Oh dear! Just after I have acknowledged that my answer misses the point, it got accepted. I've left the original answer as it is and explain the idea behind my algorithm and how it is inferior to the original solution after the fold.
This particular problem should be treated as "all integers with 18 digits" rather than "all integers between 1 and 10^18". (For the digit sum, numbers with fewer than 18 digits can be treated as 18-digit numbers with leading zeros.)
Then you can use an algorithm that spreads from the bottom up, just like the sieve of Erathostenes spreads across all non-primes.
Start with an array of digit counts dig
for the digits 1 to 9 representing 0, i.e. all zeros. (The number of zeros can be calculated as 18 - sum(dig)
. Then you can recurse like this:
recurse(dig[], pos) {
if (digitsum(dig) > 60) return;
if (digitsum(dig) == 60) {
count += poss(dig)
} else {
if (pos < 9) recurse(dig, pos + 1);
if (sum(dig) < 18) {
dig[pos]++;
recurse(dig, pos);
dig[pos]--;
}
}
}
That way you treat all combinations of digit counts and return if you exceed 60. When you hit 60 exactly, you must calculate the number of possible numbers that correspond to that digit count:
poss(dig) = 18! / prod(dig[i]!)
The product of the factorials prod(dig[i]!)
must include the factorial of zeros. (And, of course, 0! == 1
.)
That approach should be fast enough if you keep track of the sum so far and precalculate the factorials. It doesn't work if you want to calculate all numbers with a digit sum of 60 between 50 and 5,000,000,000, say.
Addendum The framework that you have linked to can treat any range from A to B. Here, let's focus on ranges from 0 to 10^n, that is n-digit numbers where numbers with fewer digits are considered to have leading zeros.
The idea of my algorithm is not to enumerate all numbers, but to consider counts of digits. For example if my number has 5x the digit 9 and 3x the digit 5, the digit sum is 60. Now we have to find how many numbers of 18 digits satisfy that condition. 590,050,005,090,900,099 is one such number, but all unique permutations of this number are valid, too. The number has 18 - (5 + 3) = 10 zeros, hence this combination has
N(5x9, 3x5) = 18! / (5! * 3! * 10!)
permutations.
The algorithms must enumerate all permutations. It keeps track of the enumerations in an array, dig
:
^ count
|
2x ... ... ... ... ... ... ... ... ...
1x ... ... ... ... ... ... ... ... ...
0x ... ... ... ... ... ... ... ... ...
---------------------------------------------> pos
1 2 3 4 5 6 7 8 9
The case above would be
dig == [0, 0, 0, 0, 3, 0, 0, 0, 5]
To achieve this, it spreads in a zig-zag pattern. The current digit is called pos
. It can either move vertically by incrementing the count of the current digit by one or it can move horizontally by considering the next digit. Recursion stops if the digit sum hits or exceed S or if pos goes beyond 9. Every time we hit S, we do our permutation calculation as shown above.
Because the array is passed by reference and is virtually the same array throughout we must decrement it after incrementing: We must clean up after tracking back.
This algorithm works and will find the answer to the number of all 18-digit numbers whose digit sum is 60 in the fraction of a second.
But it doesn't scale, because its running time grows exponentially. And also because you can calculate factorials for 18! with 64-bit integers, but after 20! you'll need big-integer arithmetic. (A clever implementation will be able to get further by simplifying the fraction N! / prod(dig[i]!)
, however.)
Now consider the code you posted. I've removed everything that calculates the ranges. The bare-bones version is:
ds_count(i, sum)
{
if (sum > 60) return 0;
if (i == 18) {
if (sum == 60) return 1;
return 0;
}
result = 0;
for (d = 0; d < 10; d++) {
result += ds_count(i + 1, sum + d);
}
return result;
}
This enumerates all 18-digit values. It stops short when the sum exceeds 60, but that's about it. This is not much better than a brute-force solution.
But this solution lends itself to memoizing. It will frequently be called with the same values, and it's easy to see why. For example the call ds_count(2, 5)
will be called from 05...
, 14...
, 23...
, 32...
, 41...
and 50...
. (That reminds me a bit of the differently sized number chips in the board game Settlers of Catan, which account for the sum of two dice rolls.)
If you can determine one of these values, you can save it and effectively cut 5 calls to a tail of 16 digits short. So:
ds_count(i, sum)
{
if (sum > 60) return 0;
if (i == 18) {
if (sum == 60) return 1;
return 0;
}
if (defined(memo[i, sum])) return memo[i, sum];
result = 0;
for (d = 0; d < 10; d++) {
result += ds_count(i + 1, sum + d);
}
memo[i, sum] = result;
return result;
}
This is very fast and has no hard limitation like the factorial solution. It's also much easier to impelment, because its a recursive enumeration at heart.
It is interesting to note that my solution isn't suited to memoizing. (Except for memoizing the factorials, but that isn't the limiting factor here.) The point of the zig-zag count-set generation is to make only unique recursive calls. There's also a state, namely the digit set, which makes memoizing harder.