3

Here is a simple programming problem from SPOJ: http://www.spoj.com/problems/PROBTRES/.

Basically, you are asked to output the biggest Collatz cycle for numbers between i and j. (Collatz cycle of a number $n$ is the number of steps to eventually get from $n$ to 1.)

I have been looking for a Haskell way to solve the problem with comparative performance than that of Java or C++ (so as to fits in the allowed run-time limit). Although a simple Java solution that memoizes the cycle length of any already computed cycles will work. I haven't been successful at applying the idea to obtain a Haskell solution.

I have tried the Data.Function.Memoize, as well as home-brewed log time memoization technique using the idea from this post: Memoization in Haskell?. Unfortunately, memoization actually makes the computation of cycle(n) even slower. I believe the slow down comes from the overhead of haskell way. (I tried running with the compiled binary code, instead of interpreting.)

I also suspect that simply iterating numbers from i to j can be costly ($i,j\le10^6$). So I even tried precompute everything for the range query, using idea from http://blog.openendings.net/2013/10/range-trees-and-profiling-in-haskell.html. However, this still gives "Time Limit Exceeding" error.

Can you help to inform a neat competitive Haskell program for this?

Thanks!

Community
  • 1
  • 1
  • 1
    Isn't that essentially the same as [Project Euler 14](https://projecteuler.net/problem=14)? In this case, you might be interested [in one of my older answers](http://stackoverflow.com/q/22416292/1139697). – Zeta Feb 16 '16 at 20:47
  • 3
    This problem is why I wrote [`data-memocombinators`](https://hackage.haskell.org/package/data-memocombinators). If you only memoize the cycle length function for inputs below a threshold (e.g. using `arrayRange`) it works very well. – luqui Feb 16 '16 at 20:47
  • 4
    Perhaps sharing your code will enable us to help you more effectively. – Emil Feb 16 '16 at 21:57

2 Answers2

4

>>> using the approach bellow, I could submit an accepted answer to SPOJ. You may check the entire code from here.


The problem has bounds 0 < n < 1,000,000. Pre-calculate all of them and store them inside an array; then freeze the array. The array can be used as its own cache / memoization space.

The problem would then reduce to a range query problem over an array, which can be done very efficiently using trees.

With the code bellow I can get Collatz of 1..1,000,000 in a fraction of a second:

$ time echo 1000000 | ./collatz 
525

real    0m0.177s
user    0m0.173s
sys     0m0.003s

Note that collatz function below, uses mutable STUArray internally, but itself is a pure function:

import Control.Monad.ST (ST)
import Control.Monad (mapM_)
import Control.Applicative ((<$>))
import Data.Array.Unboxed (UArray, elems)
import Data.Array.ST (STUArray, readArray, writeArray, runSTUArray, newArray)

collatz :: Int -> UArray Int Int
collatz size = out
    where
    next i = if odd i then 3 * i + 1 else i `div` 2

    loop :: STUArray s Int Int -> Int -> ST s Int
    loop arr k
        | size < k  = succ <$> loop arr (next k)
        | otherwise = do
            out <- readArray arr k
            if out /= 0 then return out
            else do
                out <- succ <$> loop arr (next k)
                writeArray arr k out
                return out

    out = runSTUArray $ do
        arr <- newArray (1, size) 0
        writeArray arr 1 1
        mapM_ (loop arr) [2..size]
        return arr

main = do
    size <- read <$> getLine
    print . maximum . elems $ collatz size

In order to perform range queries on this array, you may build a balanced tree as simple as below:

type Range = (Int, Int)
data Tree  = Leaf Int | Node Tree Tree Range Int

build_tree :: Int -> Tree
build_tree size = loop 1 cnt
    where
    ctz = collatz size
    cnt = head . dropWhile (< size) $ iterate (*2) 1

    (Leaf a)       +: (Leaf b)       = max a b
    (Node _ _ _ a) +: (Node _ _ _ b) = max a b

    loop lo hi
        | lo == hi  = Leaf $ if size < lo then minBound else ctz ! lo
        | otherwise = Node left right (lo, hi) (left +: right)
        where
        i = (lo + hi) `div` 2
        left  = loop lo i
        right = loop (i + 1) hi

query_tree :: Tree -> Int -> Int -> Int
query_tree (Leaf x) _ _ = x
query_tree (Node l r (lo, hi) x) i j
    | i <= lo && hi <= j = x
    | mid < i       = query_tree r i j
    | j   < 1 + mid = query_tree l i j
    | otherwise     = max (query_tree l i j) (query_tree r i j)
    where mid = (lo + hi) `div` 2
behzad.nouri
  • 74,723
  • 18
  • 126
  • 124
2

Here is the same as in the other answer, but with an immutable recursively defined array (and it also leaks slightly (can someone say why?) and so two times slower):

import Data.Array

upper = 10^6

step :: Integer -> Int
step i = 1 + colAt (if odd i then 3 * i + 1 else i `div` 2)

colAt :: Integer -> Int
colAt i | i > upper = step i
colAt i = col!i

col :: Array Integer Int
col = array (1, upper) $ (1, 1) : [(i, step i) | i <- [2..upper]]

main = print $ maximum $ elems col
effectfully
  • 12,325
  • 2
  • 17
  • 40