4

I am writing a program in which I am required to check if certain large numbers (permutations of cubes) are cubic (equal to n^3 for some n).

At the moment I simply use the method

static boolean isCube(long input) {
    double cubeRoot = Math.pow(input,1.0/3.0);
    return Math.round(cubeRoot) == cubeRoot;
}

but this is very slow when working with large numbers (10+ digits). Is there a faster way to determine if integer numbers are cubes?

David Eisenstat
  • 64,237
  • 7
  • 60
  • 120
konewka
  • 620
  • 8
  • 21
  • Does Math.cbrt(input) is faster then Math.pow(input,1.0/3.0); ? – JFPicard Aug 14 '15 at 19:31
  • I should have included that indeed, that's what I went with before, and took approximately the same time – konewka Aug 14 '15 at 19:33
  • 2
    A somewhat orthogonal suggestion would be to profile your program and check whether `isCube` is indeed a bottleneck. Sounds like permuting the digits, or converting the number to long or double, may happen to be actually slower than checking for the number being a cube. – Gassa Aug 14 '15 at 20:41
  • Indeed, I had checked, both were a bottleneck. I am now focusing on improving the permutation part – konewka Aug 14 '15 at 20:43

5 Answers5

7

There are only 2^21 cubes that don't overflow a long (2^22 - 1 if you allow negative numbers), so you could just use a HashSet lookup.

David Eisenstat
  • 64,237
  • 7
  • 60
  • 120
  • That's going to cost you a cache miss for nearly every call, unless you're calling this a *lot*, or with similar values every time. It turns out there's an integer algorithm for integer-cube-roots, from the Hacker's Delight. If high performance is worth spending time (and memory, and either precomputation or embedded-data size) on, then see my answer. But +1 because this is simple. – Peter Cordes Aug 15 '15 at 09:19
7

The Hacker's Delight book has a short and fast function for integer cube roots which could be worth porting to 64bit longs, see below.


It appears that testing if a number is a perfect cube can be done faster than actually computing the cube root. Burningmath has a technique that uses the "digital root" (sum the digits. repeat until it's a single digit). If the digital root is 0, 1 or 8, your number might be a perfect cube.

This method could be extremely valuable for your case of permuting (the digits of?) numbers. If you can rule out a number by its digital root, all permutations are also ruled out.

They also describe a technique based on the prime factors for checking perfect cubes. This looks most appropriate for mental arithmetic, as I think factoring is slower than cube-rooting on a computer.

Anyway, the digital root is quick to computer, and you even have your numbers as a string of digits to start with. You'll still need a divide-by-10 loop, but your starting point is the sum of digits of the input, not the whole number, so it won't be many divisions. (Integer division is about an order of magnitude slower than multiplication on current CPUs, but division by a compile-time-constant can be optimized to multiply+shift with a fixed-point inverse. Hopefully Java JIT compilers use that, too, and maybe even use it for runtime constants.)

This plus A. Webb's test (input % 819 -> search of a table of 45 entries) will rule out a lot of inputs as not possible perfect cubes. IDK if binary search, linear search, or hash/set would be best.

These tests could be a front-end to David Eisenstat's idea of just storing the set of longs that are perfect cubes in a data structure that allows quick is-present checks. (e.g. HashSet). Yes, cache misses are expensive enough that at least the digital-root test is probably worth it before doing a HashSet lookup, maybe both.

You could use less memory on this idea by using it for a Bloom Filter instead of an exact set (David Ehrman's suggestion). This would give another candidate-rejection frontend to the full calculation. The guavac BloomFilter implementation requires a "funnel" function to translate objects to bytes, which in this case should be f(x)=x).

I suspect that Bloom filtering isn't going to be a big win over an exact HashSet check, since it requires multiple memory accesses. It's appropriate when you really can't afford the space for a full table, and what you're filtering out is something really expensive like a disk access.

The integer cube root function (below) is probably faster than a single cache miss. If the cbrt check is causing cache misses, then probably the rest of your code will suffer more cache misses too, when its data is evicted.


Math.SE had a question about this for perfect squares, but that was about squares, not cubes, so none of this came up. The answers there did discuss and avoid the problems in your method, though. >.<

There are several problems with your method:


FP cube root, and then testing the resulting integer, avoids all the problems that the FP comparison introduced:

static boolean isCube(long input) {
    double cubeRoot = Math.cbrt(input);
    long intRoot = Math.round(cubeRoot);
    return (intRoot*intRoot*intRoot) == input;
}

(After searching around, I see other people on other stackoverflow / stackexchange answers suggesting that integer-comparison method, too.)

If you need high performance, and you don't mind having a more complex function with more source code, then there are possibilities. For example, use a cube-root successive-approximation algorithm with integer math. If you eventually get to a point where n^3 < input <(n+1)^3, theninput` isn't a cube.

There's some discussion of methods on this math.SE question.

I'm not going to take the time to dig into integer cube-root algorithms in detail, as the cbrt part is probably not the main bottleneck. Probably input parsing and string->long conversion is a major part of your bottleneck.

Actually, I got curious. Turns out there is already an integer cube-root implementation available in Hacker's Delight (use / copying / distributing even without attribution is allowed. AFAICT, it's essentially public domain code.):

// Hacker's delight integer cube-root (for 32-bit integers, I think)
int icbrt1(unsigned x) {
   int s;
   unsigned y, b;

   y = 0;
   for (s = 30; s >= 0; s = s - 3) {
      y = 2*y;
      b = (3*y*(y + 1) + 1) << s;
      if (x >= b) {
         x = x - b;
         y = y + 1;
      }
   }
   return y;
}

That 30 looks like a magic number based on the number of bits in an int. Porting this to long would require testing. (Also note that this is C, but looks like it should compile in Java, too!)

IDK if this is common knowledge among Java people, but the 32bit Windows JVM doesn't use the server JIT engine, and doesn't optimize your code as well.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • Thanks for the awesome response. I've gone now from several minutes down to 2 seconds (this is Project Euler #62, btw). The major thing I discovered regarding the permutations is that instead of taking all permutations for a long integer and then iterating through a huge list, it is much quicker to see if two long integers are anagrams of each other. I'm working on doing this using bitwise operations, which will hopefully shave more time off what I am doing. – horcle_buzz Apr 03 '18 at 01:27
  • `[pre-check digital root] could be extremely valuable for your case of permuting (the digits of?) numbers` should be pointless for the case of `permutations of cubes` (if permuting the digits of the decimal representation indeed is what `permutations` is supposed to be in the question). Otherwise, it precludes three cases out of five - before even converting characters to numbers. – greybeard Apr 03 '18 at 18:55
  • @greybeard, I did that and also abandoned my idea of only using anagram matching (I realized I was excluding a lot of potential hits). Even with `pre-check digital root` I am getting a `Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded` error when computing the permutations of these `long` integers. This is certainly a computationally expensive problem. I may end up trying a bloom filter. Never heard of it till I came across this thread. Interesting concept. Oh, and yes, that is what `permutations` is (so, even excluding 3/5ths, there is still a boatload). – horcle_buzz Apr 05 '18 at 02:16
  • What I said was backwards: I was first computing all permutations of a given `long int`, and then filtering them out of the list based on the `pre-check digital root` and if the values were less than the current `n` being tested, and if they were already in the list of cubes. No wonder this crapped out on memory. – horcle_buzz Apr 05 '18 at 17:22
6

You can first eliminate a large number of candidates by testing modulo given numbers. For example, a cube modulo the number 819 can only take on the following 45 values.

 0  125  181  818  720  811  532  755  476
 1  216   90  307  377  694  350  567  442
 8  343  559  629  658  351  190   91  469
27  512  287  252  638  118  603  161  441
64  729   99  701  792  378  260  468  728

So, you could eliminate actually having to compute the cubic root in almost 95% of uniformly distributed cases.

A. Webb
  • 26,227
  • 1
  • 63
  • 95
  • sorry, may be silly question, where does 819 come from? – Stas Apr 02 '16 at 14:03
  • 1
    @Stas I suspect I just searched for it, e.g. in R: `which.min(lapply(1:1024,function(x) length(unique((1:x)^3 %% x))/x))` returns 819. We want the lowest percentage of unique remainders with the remainders able to fit in a fairly small table. – A. Webb Apr 02 '16 at 16:08
  • 2
    @Stas 819 = 9*7*13, and so Z/819 = Z/9 x Z/7 x Z/13. These primes were chosen because in general, the multiplicative group of Z/p^n is cyclic of order (p-1)*p^{n-1}. If that order is divisible by 3, then only 1/3 of those elements can be cubes. So the only way this helps is if n=2, p=3 (any higher n doesn't help you), or p == 1 (mod 3). The first two primes == 1 (mod 3) are 7 and 13. If you wanted an even better percentage, you could multiply by 19 (the next prime == 1 (mod 3)). – Kevin Ventullo Jul 07 '20 at 00:06
0

The hackers delight routine seems to work on long numbers if you just change int to long and 30 to 60. If you change 30 to 61 it does not seem to work.

I didn't really understand the program, so I made another version that seems to work in Java.

private static int cubeRoot(long n) {
    final int MAX_POWER = 21;
    int power = MAX_POWER;
    long factor;
    long root = 0;
    long next, square, cube;

    while (power >= 0) {
        factor = 1 << power;
        next = root + factor;
        while (true) {
            if (next > n) {
                break;
            }
            if (n / next < next) {
                break;
            }
            square = next * next;
            if (n / square < next) {
                break;
            }
            cube = square * next;
            if (cube > n) {
                break;
            }
            root = next;
            next += factor;
        }
        --power;
    }
    return (int) root;
}
-1

Please define very show. Here is a test program:

public static void main(String[] args) {
    for (long v = 1; v > 0; v = v * 10) {
        long start = System.nanoTime();
        for (int i = 0; i < 100; i++)
            isCube(v);
        long end = System.nanoTime();
        System.out.println(v + ": " + (end - start) + "ns");
    }
}
static boolean isCube(long input) {
    double cubeRoot = Math.pow(input,1.0/3.0);
    return Math.round(cubeRoot) == cubeRoot;
}

Output is:

1: 290528ns
10: 46188ns
100: 45332ns
1000: 46188ns
10000: 46188ns
100000: 46473ns
1000000: 46188ns
10000000: 45048ns
100000000: 45048ns
1000000000: 44763ns
10000000000: 45048ns
100000000000: 44477ns
1000000000000: 45047ns
10000000000000: 46473ns
100000000000000: 47044ns
1000000000000000: 46188ns
10000000000000000: 65291ns
100000000000000000: 45047ns
1000000000000000000: 44477ns

I don't see a performance impact of "large" numbers.

Andreas
  • 154,647
  • 11
  • 152
  • 247
  • While helpful, this doesn't qualify as an 'answer'. It would be better to use an online-compiler to demonstrate your code and output, and share a link with a comment. – CubeJockey Aug 14 '15 at 19:38
  • 1
    I have a suspicion that the calls to `isCube` are just being optimized out, since nothing is happening with the results. – resueman Aug 14 '15 at 19:45
  • @resueman Yeah, you're right, that eventually happens. If you change the inner loop to 10000, the performance of 0 and 10 is high and all the rest are 0ns. – Andreas Aug 14 '15 at 19:55
  • @Andreas: first couple iterations are probably slow due to any number of warm-up effects, like CPU idle vs. turbo frequency, caches, JIT. You didn't say which JVM, either. 64bit Oracle and OpenJDK JVMs default to `server` VM, which is more aggressive with JIT and will probably optimize away the loop. 44us is a LOT of CPU cycles though; Hopefully `nanoTime()` doesn't have that much overhead. – Peter Cordes Aug 15 '15 at 07:08
  • @Andreas: the general idea of microbenchmarking which parts of the process are slow is a good idea, or better, using a profiler. This isn't a very good test, though. I expect some of the per-digit is in the string->float conversion, and `cbrt` probably has a fairly constant cost. (It has to get the result to full `double` precision, even if the result isn't an integer.) – Peter Cordes Aug 15 '15 at 09:13
  • The test for *rounded result is a natural number* is ***wrong***. `final long cr = Math.round(cubeRoot); return cr*cr*cr == input;` Also, "never" trust sub-second benchmark results. Oh, in the loop, `if (isCube(v)) c += 1;` and print `c` in result line. – greybeard Apr 03 '18 at 19:24