Tidying up a list comprehension in Haskell

Question

So I'm trying to generate a list of taxicab numbers in Haskell. Taxicab numbers are numbers that can be written as the sum of two distinct cubes in two different ways - the smallest is 1729 = 1^3 + 12^3 = 9^3 + 10^3.

For now, I'm just bothered about generating the four numbers that "make up" the taxicab number e.g. (1,12,9,10), and have been told to use a list comprehension (which I'm not too familiar with). This function will generate all 4-tuples where the largest number is at most n:

taxi n = [(a,b,c,d) | a <- [1..n], b <- [1..n], c <- [1..n], d <- [1..n], a^3 + b^3 == c^3 + d^3, a < b, a < c, c < d]

However, it is troublesome for a few reasons:

The domain for a,b,c,d are all identical but I can't work out how to simplify this code so [1..n] is only written once.
I want an infinite list, without an upper limit, so I can just leave the program running for as long as possible and terminate it when I please. Clearly if I just set a <- [1..] etc. then the program will never end up evaluating anything.
The program is very slow: just taxi 50 takes 19 seconds.

Any speed optimisations would also be good, but if not the naive method I'm using is sufficient.

Who told you to use list comprehension? How are you benchmarking this? Please say `ghc -O2 -fforce-recomp -fllvm`. What order should the search be in? Depth first? Why would nothing be evaluated if `a` is drawn from an infinite list? — Thomas M. DuBuisson, Dec 26 '17 at 20:41
Using a list comprehension is a requirement of the question. I don't have a clue what your "ghc ..." code means; I used the command `:set +s` in Haskell, but speed is not my main concern. Any sensible order would be good for me - the problem is that I don't know how to get Haskell to evaluate in a specific order, though I could construct a bijection ℕ^4 -> ℕ that would cycle through the integers. I am aware infinite lists are necessary; I meant that replacing `a <- [1..n]` with `a <- [1..]` without further changes will cause the program to never output anything. — A.M., Dec 26 '17 at 20:49
FYI `:set +s` is not done "in Haskell" but in the GHC interpreter named GHCi. In addition to this interpreter there is a compiler named ghc which can produce machine code that runs much faster. The use of list comprehension locks you in to something of a breadth first exploration of values for the first list, considering all possible values for later listed variables before considering the next value for the first one. This is easy and educational to play with using statements such as `[ (a,b,c) | a <- [1..3], b <- [1,2], c <- [1,2] ]`. — Thomas M. DuBuisson, Dec 27 '17 at 00:41
@Soldalma not at all. Consider (1,12,9,10). The inequality conditions are only there to avoid also getting the other 23 permutations of this tuple e.g. (9,10,1,12) or (12,1,9,10). Within each pair, we want the lower number first and of the pairs, we want the one with the smallest number first (of course, this is only one possible solution). — A.M., Dec 28 '17 at 21:18

Stefan Pochmann · Accepted Answer · 2017-12-27T05:41:11.797

13

Your constraints imply a < c < d < b. So let b run outermost and let the others run in appropriate lower ranges:

taxi n = [ (a,b,c,d) | b <- [1..n],
                       d <- [1..b-1],
                       c <- [1..d-1],
                       a <- [1..c-1],
                       a^3 + b^3 == c^3 + d^3 ]

To go infinite, simply use b <- [1..].

A further big improvement is to compute one of the four variables from the other three:

taxi = [ (a,b,c,d) | b <- [1..],
                     c <- [1..b-1],
                     a <- [1..c-1],
                     let d3 = a^3 + b^3 - c^3,
                     let d = round(fromIntegral(d3)**(1/3)),
                     c < d,
                     d^3 == d3 ]

Benchmarking taxi 50 in GHCi with :set +s like you did:

Yours:          (16.49 secs, 17,672,510,704 bytes)
My first:        (0.65 secs,    658,537,184 bytes)
My second:       (0.09 secs,     66,229,376 bytes)  (modified to use b <- [1..n] again)
Daniel's first:  (1.94 secs,  2,016,810,312 bytes)
Daniel's second: (2.87 secs,  2,434,309,440 bytes)

edited Dec 27 '17 at 05:41

answered Dec 26 '17 at 20:59

Stefan Pochmann

27,593
8
44
107

Ahhh, that's so simple! Thank you. One question: how did you get those ranges for the list? I deduced `b <- [1..n], a <- [1..b-3], c <- [a+1..b-2], d <- [c+1..b-1]` since the four integers will all be distinct. – A.M. Dec 26 '17 at 21:13
@A.Morris Yeah, you can do that. Doesn't really help, though. You just avoid some irrelevantly tiny dead ends, at the cost of having to think and increasing your chance to make a mistake. – Stefan Pochmann Dec 26 '17 at 21:19
2

Love this solution, much cleaner than mine. You might also like [exactCubeRoot](http://hackage.haskell.org/package/arithmoi-0.6.0.1/docs/Math-NumberTheory-Powers-Cubes.html#v:exactCubeRoot). (Probably won't matter within the amount of time you're willing to wait, of course.) – Daniel Wagner Dec 26 '17 at 22:23
@DanielWagner Thanks, I might try that later. I just started learning Haskell, have no idea yet how to use that. – Stefan Pochmann Dec 26 '17 at 22:27
@A.Morris I just changed the order of my variables, completely going from largest to smallest. Now the ranges all start at 1, looks much nicer. I even considered to instead start them at 4, 3, 2 and 1, as it's now so orderly. But I still prefer the simplicity of all just starting at 1. – Stefan Pochmann Dec 27 '17 at 06:10

Daniel Wagner · Answer 2 · 2017-12-26T20:58:03.503

The domain for a,b,c,d are all identical but I can't work out how to simplify this code so [1..n] is only written once.

Use [a,b,c,d] <- replicateM 4 [1..n].

The program is very slow: just taxi 50 takes 19 seconds.

One cheap improvement is to bake your a<b, a<c, and c<d conditions into the comprehension.

taxi n = [(a,b,c,d) | a <- [1..n], b <- [a+1..n], c <- [a+1..n], d <- [c+1..n], a^3 + b^3 == c^3 + d^3]

This makes things significantly faster on my machine.

Or, for better composability with the next (and previous) part of the answer, treat b, c, and d as offsets.

taxi n =
    [ (a,b,c,d)
    | a <- [1..n]
    , b_ <- [1..n], let b = a+b_, b<=n
    , c_ <- [1..n], let c = a+c_, c<=n
    , d_ <- [1..n], let d = c+d_, d<=n
    , a^3 + b^3 == c^3 + d^3
    ]

I want an infinite list, without an upper limit.

See my answer to Cartesian product of 2 lists in Haskell for a hint. tl;dr use choices.

@Soldalma Because the original question specifies `a<` twice and no `b<`. — Daniel Wagner, Dec 27 '17 at 17:16

Elmex80s · Answer 3 · 2018-07-04T07:43:21.397

Picking up Stefan his excellent answer. Given a^3 + b^3 == c^3 + d^3 we only have to look at integers for which holds 0 < a < c < b. Now introduce this (infinite) iterative structure

-- generates all integers x, y and z for which holds 0 < x < y < z
triplets = [(x, y, z) | z <- [3 .. ], y <- [2 .. z - 1], x <- [1 .. y - 1]]

This will give us triplets easy accessible from our list comprehension which we will present here after. For people with a Python background, this should be equivalent to Python's yield.

Next we need something to (quickly) find the largest cube and test for integers to be cube, also called integer cube root. There is this package Math.NumberTheory.Powers.Cubes which has functions for these tasks. Or just use these

-- given integer x >= 0 find the largest integer r such that r^3 <= x
largestCube :: Integral a => a -> a
largestCube x = 
    let powers_of_two = iterate ((*) 2) 1
        upper         = head [j | j <- powers_of_two, x < j ^ 3]
    in  largestCubeSub 0 upper x


largestCubeSub :: Integral a => a -> a -> a -> a
largestCubeSub lower upper x
    | lower + 1 == upper   = lower
    | b ^ 3 <= x           = largestCubeSub b upper x
    | otherwise            = largestCubeSub lower b x
    where
    b = div (lower + upper) 2


-- test if an integer x >= 0 is a cube
isCube :: Integral a => a -> Bool
isCube x = (largestCube x) ^ 3 == x

Now your compact list comprehension for the first 50 taxicab numbers looks like

*Main> condition = \a b c -> and [isCube (a^3 + b^3 - c^3), a^3 + b^3 - c^3 > c^3]
*Main> taxi = [(a, b, c, largestCube (a^3 + b^3 - c^3)) | (a, c, b) <- triplets, condition a b c]
*Main> first50 = take 50 taxi

Printing them using

*Main> single_line = \(x, y, z, u) -> unwords [show i | i <- [x, y, z, u]]
*Main> putStrLn $ unlines $ map single_line first50

Will give

1 12 9 10
2 16 9 15
2 24 18 20
10 27 19 24
4 32 18 30
2 34 15 33
9 34 16 33
3 36 27 30
17 39 26 36
12 40 31 33
6 48 27 45
4 48 36 40
12 51 38 43
8 53 29 50
20 54 38 48
17 55 24 54
9 58 22 57
3 60 22 59
5 60 45 50
8 64 36 60
30 67 51 58
4 68 30 66
18 68 32 66
42 69 56 61
6 72 54 60
17 76 38 73
5 76 48 69
34 78 52 72
10 80 45 75
15 80 54 71
24 80 62 66
30 81 57 72
51 82 64 75
7 84 63 70
2 89 41 86
11 93 30 92
23 94 63 84
12 96 54 90
50 96 59 93
8 96 72 80
20 97 33 96
47 97 66 90
35 98 59 92
24 98 63 89
29 99 60 92
6 102 45 99
27 102 48 99
23 102 60 95
24 102 76 86
1 103 64 94

Within a few seconds it returns the first 50 Taxi-cab numbers.

Tidying up a list comprehension in Haskell

3 Answers3