
Over the last few days I've been particularly interested (more from an algorithmic than a mathematical perspective) in investigating the length of a given number's Hailstone sequence (Collatz conjecture). Implementing a recursive algorithm is probably the simplest way to calculate the length, but it seemed to me like an unnecessary waste of calculation time, since many sequences overlap. Take, for example, 3's Hailstone sequence:

3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1

This has length 7; more specifically, it takes 7 operations to get to 1. If we then take 6:

6 -> 3 -> ...

We notice immediately that we've already calculated the rest of this sequence, so instead of running through all those numbers again we just add on the sequence length of 3, considerably reducing the number of operations required to calculate the sequence length of each number.
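Concretely, the length of 6's sequence is 1 + length(3) = 1 + 7 = 8, at the cost of a single step.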

I tried to implement this in Java using a HashMap (which seemed appropriate given its expected O(1) get/put complexity):

import java.util.HashMap;

/* NOTE: cache.put(1,0); is called in main to act as the
 * 'base case' of sorts. 
 */

private static HashMap<Long, Long> cache = new HashMap<>();

/* Returns the length of n's sequence, pulling a prerecorded value
 * from the cache whenever possible, and saving unrecorded values
 * to the cache.
 */
static long seqLen(long n) {
    long count = 0, m = n;
    while (true) {
        if (cache.containsKey(n)) {
            count += cache.get(n);
            cache.put(m, count);
            return count;
        }
        else if (n % 2 == 0) {
            n /= 2;
        }
        else {
            n = 3*n + 1;
        }
        count++;
    }
}

What seqLen essentially does is start at a given number and work through that number's Hailstone sequence until it comes across a number already in the cache. At that point it adds the cached length on to the current value of count, and then logs the starting value and its sequence length in the HashMap as a (key, val) pair.
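For reference, a minimal driver in the spirit of the note in the code above might look like this (the main method and the numbers I call it on are just illustrative; the cache.put(1L, 0L) seeding is the 'base case' mentioned there):

public static void main(String[] args) {
    cache.put(1L, 0L);                // base case: 1 takes 0 operations to reach 1
    System.out.println(seqLen(3));    // 7, computed the long way
    System.out.println(seqLen(6));    // 8, reusing the cached length of 3
}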

I also had the following fairly standard recursive algorithm for comparison:

static long recSeqLen(long n) {
    if (n == 1) {
        return 0;
    }
    else if (n % 2 == 0) {
        return 1 + recSeqLen(n / 2);
    }
    else return 1 + recSeqLen(3*n + 1);
}

The logging algorithm should, by all accounts, run quite a bit quicker than the naive recursive method. However, in most cases it doesn't run much faster at all, and for larger inputs it actually runs slower. Running the following code yields times that vary considerably as the size of n changes:

long n = ... // However many numbers I want to calculate sequence
             // lengths for.

long st = System.nanoTime();
// Iterative logging algorithm
for (long i = 2; i < n; i++) {
    seqLen(i);
}
long et = System.nanoTime();
System.out.printf("HashMap algorithm: %d ms\n", (et - st) / 1000000);

st = System.nanoTime();
// Using recursion without logging values:
for (long i = 2; i < n; i++) {
    recSeqLen(i);
}
et = System.nanoTime();
System.out.printf("Recusive non-logging algorithm: %d ms\n",
                    (et - st) / 1000000);

  • n = 1,000: ~2 ms for both algorithms
  • n = 100,000: ~65 ms for the iterative logging algorithm, ~75 ms for the recursive non-logging one
  • n = 1,000,000: ~500 ms and ~900 ms respectively
  • n = 10,000,000: ~14,000 ms and ~10,000 ms respectively

At higher values I get memory errors, so I can't check if the pattern continues.

So my question is: why does the logging algorithm suddenly begin to take longer than the naive recursive algorithm for large values of n?


EDIT:

Scrapping HashMaps altogether and opting for a simple array structure (as well as removing part of the overhead of checking whether a value is in the array or not) produces the desired efficiency:

private static final int CACHE_SIZE = 80000000;
private static long[] cache = new long[CACHE_SIZE];

static long seqLen(long n) {
    int count = 0;
    long m = n;

    // Follow the sequence until it falls back to a value no larger than
    // the starting value m; those lengths have already been cached when
    // the driver iterates upwards from 2.
    do {
        if (n % 2 == 0) {
            n /= 2;
        }
        else {
            n = 3*n + 1;
        }
        count++;
    } while (n > m);

    count += cache[(int)n];   // add the already-known length of n
    cache[(int)m] = count;    // record the full length for m
    return count;
}

Iterating over the entire cache size (80 million) now takes a mere 3 seconds, as opposed to 93 seconds using the recursive algorithm. The HashMap algorithm throws a memory error, so it can't even be compared, but given its behaviour at lower values, I have a feeling it wouldn't compare well.
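For reference, the driver here is just the earlier timing loop run up to CACHE_SIZE, along these lines (Java zero-initialises the array, so cache[1] == 0 already plays the role of the cache.put(1, 0) base case):

long st = System.nanoTime();
// Iterative array-logging algorithm
for (long i = 2; i < CACHE_SIZE; i++) {
    seqLen(i);
}
long et = System.nanoTime();
System.out.printf("Array algorithm: %d ms\n", (et - st) / 1000000);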

  • Similar question in Prolog: http://stackoverflow.com/questions/30026151/uneven-tabling-performance-in-bprolog-8-1 –  May 15 '16 at 20:56
  • Part of the problem is little (re-)use of cache entries for large (parameter) values: caching "odd results" only, determining the first (half) million (odd) lengths enters 293698 lengths into my cache for parameters > 1e6, of which 9138 get used, 73 for a maximum of twice. Got curious about 8e7: 23741549 entries ">8e7", 729540 re-used, 6077 twice. – greybeard Nov 08 '16 at 00:48

1 Answer


Off the cuff, I'd guess it's spending a lot of time reallocating the hash map. It sounds like you're starting it off empty and keep adding to it. That means as it grows in size, it will need to allocate a bigger chunk of memory to store your data and recompute the hashes for all elements, which is O(N). Try pre-allocating the size to what you expect to put in there. See https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html for more discussion.
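For example, something along these lines (the initial capacity is just an illustrative guess, not a tuned value):

// Pre-size the map so it doesn't have to grow and rehash repeatedly.
// 1 << 24 (~16.7 million buckets) is an illustrative guess for n around 10 million.
private static HashMap<Long, Long> cache = new HashMap<>(1 << 24, 0.75f);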

Mark Tozzi
  • I did indeed initialize my HashMap with the default values (16 capacity and 0.75 load factor); however, changing these values seems to make little to no difference to the speed of the algorithm. – SilverSylvester Oct 29 '15 at 10:19
  • @SilverSylvester: I was able to reproduce your results, and I agree that by 10M, the cache starts to perform worse. I thought maybe you were getting a lot of cache misses, so I wrote a version that cached even more, and it performed even worse. My best guess at this point is that for large N, the sequence is so sparse that you don't get enough cache hits to pay for the overhead. You can see my attempts here: https://gist.github.com/not-napoleon/47a2baece1f23678aad3 – Mark Tozzi Oct 29 '15 at 20:14
  • I think you may be right, there is too much overhead. I've added an edit to my original question which outlines an algorithm that works as intended, without using a HashMap structure at all. In hindsight, a HashMap was probably not the best choice of data structure. Data is logged sequentially anyway, so there was no need for O(1) access to a value bound to a certain key; I can just let the array index stand for the key and get O(1) access. – SilverSylvester Oct 29 '15 at 21:02