Haskell ways to the 3n+1 challenge

Question

Here is a simple programming problem from SPOJ: http://www.spoj.com/problems/PROBTRES/.

Basically, you are asked to output the biggest Collatz cycle for numbers between i and j. (Collatz cycle of a number $n$ is the number of steps to eventually get from $n$ to 1.)

I have been looking for a Haskell way to solve the problem with comparative performance than that of Java or C++ (so as to fits in the allowed run-time limit). Although a simple Java solution that memoizes the cycle length of any already computed cycles will work. I haven't been successful at applying the idea to obtain a Haskell solution.

I have tried the Data.Function.Memoize, as well as home-brewed log time memoization technique using the idea from this post: Memoization in Haskell?. Unfortunately, memoization actually makes the computation of cycle(n) even slower. I believe the slow down comes from the overhead of haskell way. (I tried running with the compiled binary code, instead of interpreting.)

I also suspect that simply iterating numbers from i to j can be costly ($i,j\le10^6$). So I even tried precompute everything for the range query, using idea from http://blog.openendings.net/2013/10/range-trees-and-profiling-in-haskell.html. However, this still gives "Time Limit Exceeding" error.

Can you help to inform a neat competitive Haskell program for this?

Thanks!

Isn't that essentially the same as [Project Euler 14](https://projecteuler.net/problem=14)? In this case, you might be interested [in one of my older answers](http://stackoverflow.com/q/22416292/1139697). — Zeta, Feb 16 '16 at 20:47
This problem is why I wrote [`data-memocombinators`](https://hackage.haskell.org/package/data-memocombinators). If you only memoize the cycle length function for inputs below a threshold (e.g. using `arrayRange`) it works very well. — luqui, Feb 16 '16 at 20:47
Perhaps sharing your code will enable us to help you more effectively. — Emil, Feb 16 '16 at 21:57

behzad.nouri · Answer 1 · 2016-02-18T01:39:28.470

>>> using the approach bellow, I could submit an accepted answer to SPOJ. You may check the entire code from here.

The problem has bounds 0 < n < 1,000,000. Pre-calculate all of them and store them inside an array; then freeze the array. The array can be used as its own cache / memoization space.

The problem would then reduce to a range query problem over an array, which can be done very efficiently using trees.

With the code bellow I can get Collatz of 1..1,000,000 in a fraction of a second:

$ time echo 1000000 | ./collatz 
525

real    0m0.177s
user    0m0.173s
sys     0m0.003s

Note that collatz function below, uses mutable STUArray internally, but itself is a pure function:

import Control.Monad.ST (ST)
import Control.Monad (mapM_)
import Control.Applicative ((<$>))
import Data.Array.Unboxed (UArray, elems)
import Data.Array.ST (STUArray, readArray, writeArray, runSTUArray, newArray)

collatz :: Int -> UArray Int Int
collatz size = out
    where
    next i = if odd i then 3 * i + 1 else i `div` 2

    loop :: STUArray s Int Int -> Int -> ST s Int
    loop arr k
        | size < k  = succ <$> loop arr (next k)
        | otherwise = do
            out <- readArray arr k
            if out /= 0 then return out
            else do
                out <- succ <$> loop arr (next k)
                writeArray arr k out
                return out

    out = runSTUArray $ do
        arr <- newArray (1, size) 0
        writeArray arr 1 1
        mapM_ (loop arr) [2..size]
        return arr

main = do
    size <- read <$> getLine
    print . maximum . elems $ collatz size

In order to perform range queries on this array, you may build a balanced tree as simple as below:

type Range = (Int, Int)
data Tree  = Leaf Int | Node Tree Tree Range Int

build_tree :: Int -> Tree
build_tree size = loop 1 cnt
    where
    ctz = collatz size
    cnt = head . dropWhile (< size) $ iterate (*2) 1

    (Leaf a)       +: (Leaf b)       = max a b
    (Node _ _ _ a) +: (Node _ _ _ b) = max a b

    loop lo hi
        | lo == hi  = Leaf $ if size < lo then minBound else ctz ! lo
        | otherwise = Node left right (lo, hi) (left +: right)
        where
        i = (lo + hi) `div` 2
        left  = loop lo i
        right = loop (i + 1) hi

query_tree :: Tree -> Int -> Int -> Int
query_tree (Leaf x) _ _ = x
query_tree (Node l r (lo, hi) x) i j
    | i <= lo && hi <= j = x
    | mid < i       = query_tree r i j
    | j   < 1 + mid = query_tree l i j
    | otherwise     = max (query_tree l i j) (query_tree r i j)
    where mid = (lo + hi) `div` 2

Thanks for the answer. But I have tried submit your code (both version), SPOJ won't accept it, though... — haskell looks great, Apr 15 '16 at 06:03
@haskelllooksgreat i just checked that the code i have posted on pastebin [(here)](http://pastie.org/pastes/10726632/text) does in fact get accepted by spoj — behzad.nouri, Apr 15 '16 at 06:42
I re-submitted a second time, and it indeed got accepted (resulting in 0.42 second runtime)! — haskell looks great, Apr 15 '16 at 12:09

effectfully · Answer 2 · 2016-02-17T06:28:37.487

Here is the same as in the other answer, but with an immutable recursively defined array (and it also leaks slightly (can someone say why?) and so two times slower):

import Data.Array

upper = 10^6

step :: Integer -> Int
step i = 1 + colAt (if odd i then 3 * i + 1 else i `div` 2)

colAt :: Integer -> Int
colAt i | i > upper = step i
colAt i = col!i

col :: Array Integer Int
col = array (1, upper) $ (1, 1) : [(i, step i) | i <- [2..upper]]

main = print $ maximum $ elems col

Haskell ways to the 3n+1 challenge

2 Answers2