How can I optimize this code in Haskell

Question

I need your help about optimization. Here is the question:

What is the smallest (strictly positive) natural number which can not be obtained using 10 times the number 3, combined with the usual arithmetic operators?

And here is my solution:

import Ratio

num x 1 = [x]
num x n = num'' $ concat $ [ num' a b | i <- [1..n`quot`2], a <- num x i, b <- num x (n-i) ]
  where  num' a b | a==0      = [b] 
                  | b==0      = [a]
                  | otherwise = [a+b,abs(a-b),a*b,a/b,b/a]
         num'' (x:r) = x : num'' (filter (x/=) r)
         num'' _      = []

cint x = map numerator . filter ((1==) . denominator) . num x

firstNumber x n = take 1 [ i | i <- [1..], i `notElem` (cint x n) ]

But it is so inefficient. For example, when I call firstNumber 3 7, it takes nearly 30 second to see the result. But, I haven't seen the result of firstNumber 3 10 though I have waited for a long time. How can I optimize it?

Have you compiled it with optimizations? Also, could you explain the problem a bit more? What do you mean by "the first number which can not be obtained using 10 times of 3?" Specifically, I don't understand what "10 times of 3" is supposed to mean. — bheklilr, Jun 04 '14 at 21:15
A tip: you have `num''` (not a very good function name IMO) that essentially ensures that there are no duplicates in your list. Why not use `Data.Set`? It's a much more efficient implementation of an unordered collection of unique elements, since internally it's implemented as a binary tree. You could probably save a lot of time just with `num'' = Data.Set.toList . Data.Set.fromList`, unless `num''` is meant to accept infinite streams. Since it looks like you have a finite search space this should be safe. — bheklilr, Jun 04 '14 at 21:23
You will use 3,3,3,3,3,3,3,3,3,3 (10 times of 3). And four operands (+,-,*,/) between each 3. Think about all possibilities. For example, when you call firstNumber 3 6, you obtain 1,2,3,...31 but not 32. Understand now? — sasas, Jun 04 '14 at 21:26
Yes, that makes it more clear. I thought that was what you were going for, but I wanted to clarify before moving forward. — bheklilr, Jun 04 '14 at 21:27
How's that : "What is the smallest (strictly positive?) natural number which can not be obtained using 10 times the number 3, combined with the usual arithmetic operators" ? — didierc, Jun 04 '14 at 21:50
@didierc I think I would prefer "What is the smallest natural number which can not be obtained by using the usual arithmetic operations 10 times with only the number 3?" — bheklilr, Jun 04 '14 at 21:53
@bheklilr, that would be "using the usual arithmetic operations 9 times." The formula would have ten threes and nine operators. — Heatsink, Jun 04 '14 at 21:54
You are not looking for the smallest number? Then what is the ordering so as to get the correct "first" one? Besides your list comprehension definition follows that ordering. So, what is not your goal? — didierc, Jun 04 '14 at 22:13

Heatsink · Answer 1 · 2014-06-04T23:51:13.087

We can approach the problem in terms of generating sets of partial answers: num k 1 is the set {k}, num 2 is the set of all numbers generated by combining num k 1 with num k 1, num k 3 is the set of all numbers generated by combining num k 1 with num k 2, and so forth. Each step uses the sets computed in previous steps, and applies one operator. Here are the first 3 steps. Note that each number is computed using two previously generated numbers and one operator.

num 3 1 = {3}
num 3 2 = {3-3=0, 3/3=1, 3=3, 3+3=6, 3*3=9}
num 3 3 = {3-9=-6, 3-6=-3, 1-3=-2, 0=0, 1/3=1/3, 3/6=1/2, 1=1, 6/3=2, 3=3, 3+1=4, 6=6, 9=9, 3+9=12, 3*6=18, 3*9=27}

Your list-based num function is recomputing previous steps for two reasons.

Step n recomputes all previous steps from scratch. For instance, num x 4 will compute num x 1, num x 2, and num x 3. Then, num x 3 will compute num x 1 and num x 2 again.
There is a recursive call to num in an inner loop. Specifically, you have [... | ... a <- num x i, b <- num x (n-i) ]. This will recompute num x (n-i) for each value of a. You can move the recursive call out of the inner loop by writing [... | ... let b_input = num x (n-i), a <- num x i, b <- b_input]. (Compiler optimizations may do this automatically, but you shouldn't rely on it.)

Instead of recomputing previous results, you should save and reuse them. The easiest way to do this is to save a list of previous results as the algorithm proceeds. Converting your code to save previous results is an instance of a more general technique known as memoization.

Another source of inefficiency is num'', which searches the entire list to remove duplicates in quadratic time. Duplicates cam be removed in n*log(n) time using sets from the Data.Set module.

In summary, in num k n, do not recursively call num because this will do redundant work. Instead of recursive calls, save the list of results from num k 1, num k 2, ... num k (n-1) and pass this list to num k n. Also, use the Data.Set module to remove duplicate values instead of calling num''.

How will I use the Data.Set module to remove duplicate values instead of calling num'' ? Can you give an example please? — sasas, Jun 05 '14 at 07:37
@sasas You can see some example implementations in the answers of http://stackoverflow.com/q/16108714/507803. — Heatsink, Jun 05 '14 at 12:44
@sasas If you need more detailed guidance with Haskell or algorithms, you can ask on the freenode IRC channel. I think IRC is a better medium for teaching. — Heatsink, Jun 05 '14 at 19:35

How can I optimize this code in Haskell

1 Answers1