Rabhi and Lapalme's Algorithms: A Functional Programming Approach has a nice chapter on this which illustrates some FP concepts being put to use, namely higher order functions and lazy evaluation. I assume it's OK for me to reproduce a simplified version of their higher order function.
It's simplified in that it only works on functions that take Int as input and produce Int as output. Because we're using Int in two different ways, I'll make synonyms for them: "Key" and "Value". But don't forget that because these are synonyms, it's perfectly possible to use a Key where a Value is expected and vice versa. They're only there for readability.
type Key = Int
type Value = Int
dynamic :: (Table Value Key -> Key -> Value) -> Key -> Table Value Key
dynamic compute bnd = t
  where t = newTable (map (\coord -> (coord, compute t coord)) [0..bnd])
Let's dissect this function a little.
First, what does this function do? From the type signature we can see that it somehow manipulates tables. Indeed the first argument "compute" is a function (hence dynamic is a "higher order" function) which produces some sort of value from a table and a key, and the second argument is just some kind of upper bound, telling us where to stop. And as output, the "dynamic" function gives us some kind of Table. If we want to get the answer to some DP-friendly problem, we run "dynamic" and then look up the answer in our Table.
To use this function to compute Fibonacci numbers, we would run it a little like this:
fib n = findTable (dynamic helper n) n
  where
    helper t i =
      if i <= 1
        then i
        else findTable t (i-1) + findTable t (i-2)
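Just to give a feel for what it produces, here's roughly what you might see in GHCi, assuming the Table helpers sketched a bit further down are in scope; the results are the usual Fibonacci numbers with fib 0 = 0:
ghci> fib 10
55
ghci> map fib [0..6]
[0,1,1,2,3,5,8]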
Don't worry too much about understanding this fib function for now. It'll become a bit clearer as we explore "dynamic".
Second, what sort of prerequisites do we need to understand this function? I'll assume you're more or less familiar with the syntax: [0..x] to indicate a list from 0 to x, and the -> in type signatures like Int -> Table -> ... versus the -> in anonymous functions like \coord -> ... If you're not comfortable with these, they might get in the way.
Another prerequisite to tackle is this lookup Table. We don't want to worry about how it works, but let's assume that we can create them from lists of key-value pairs and also look up entries in them:
newTable :: [(k,v)] -> Table v k
findTable :: Table v k -> k -> v
Three things to note here:
- For simplicity, we're not using the equivalent from the Haskell standard library
- findTable will crash if you ask it to look up a key that isn't in the table. We can use a fancier version to avoid this if needed, but that's a subject for a different post
- Strangely enough, I didn't mention any sort of "add a value to the table" function, even though the book and standard Haskell libraries provide one. Why not?
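If you'd like something concrete to load into GHCi, here's one possible stand-in for the book's Table, backed by Data.Map from the containers library. This is just a sketch to make the examples runnable, not the book's implementation, and the Ord constraint is an artifact of using Data.Map:
import qualified Data.Map as Map

-- A minimal stand-in for the book's Table type. Note the parameter order:
-- Table v k stores values of type v under keys of type k.
newtype Table v k = Table (Map.Map k v)

-- Build a table from a list of key-value pairs.
newTable :: Ord k => [(k, v)] -> Table v k
newTable = Table . Map.fromList

-- Look up a key; like the book's version, this crashes on a missing key.
findTable :: Ord k => Table v k -> k -> v
findTable (Table m) k = m Map.! k
Conveniently, Data.Map is lazy in its values, which is exactly what dynamic relies on, as we'll see below.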
Finally, how does this function actually work? What's going on here? We can zoom in a bit on the meat of the function,
t = newTable (map (\coord -> (coord, compute t coord)) [0..bnd])
and methodically tear it apart. Going from the outside in, we've got t = newTable (...), which seems to tell us that we're building a table from some sort of list. Boring. What about the list?
map (\coord -> (coord, compute t coord)) [0..bnd]
Here we've got the higher order map function walking down a list from 0 to bnd and producing a new list as a result. To compute the new list, it's using a function \coord -> (coord, compute t coord).
Keep in mind the context: we're trying to build a table from key-value pairs, so if you study the tuple, the first part coord must be the key and the second part compute t coord must be the value. That second part is where things get exciting. Let's zoom in a little further
compute t coord
We're building up a table from key-value pairs, and the value we're plugging into the table comes from running "compute t coord". Something I didn't mention earlier is that compute takes a table and a key as input and tells us what value we ought to plug into the table, in other words, what value we should associate with that key. The idea then, to bring this back to dynamic programming, is that the compute function uses previous values from the table to compute the new value we ought to plug in.
And that's all! To do dynamic programming in Haskell we can build up some kind of table by successively plugging values into cells using a function that looks up prior values from the table. Easy, right?... or is it?
Perhaps you'll have a similar experience to mine, so I want to share my current progress in grappling with this function. When I first read it, it seemed to make a kind of intuitive sense and I didn't think much more of it. Then I read it more closely and did a sort of double-take: wait, what?! How can this possibly work? Take a second look at this snippet of code.
compute t coord
To compute the value at a given cell, and thus fill the table, we pass in t, the very table we're trying to create in the first place. If functional programming is about immutability, as you point out, how can this business of using values we haven't computed yet possibly work? If you have a little bit of FP under your belt, you might be asking yourself, as I did: is that an error? Shouldn't this be a "fold" instead of a "map"?
The key here is lazy evaluation. The little bit of magic that makes it possible to create an immutable value from bits of itself all comes down to laziness. Being a sort of long-term-yellow-belt Haskeller, I still find the notion of laziness a bit baffling. So I'll have to let somebody else take over here.
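If it helps, the same knot-tying trick shows up in a much smaller, standard example that has nothing to do with the book: a lazy list of Fibonacci numbers defined in terms of itself.
-- A self-referential lazy definition: every element past the first two is
-- computed from earlier elements of the very list being defined.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- take 8 fibs  ==  [0,1,1,2,3,5,8,13]
The table in dynamic is the same idea, just with a Table full of thunks in place of a list.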
In the meantime, I simply tell myself that this is OK. I content myself with visualising the Table as a sort of dot with lots of arrows sticking out of it. Taking fib as an example:
o
|
|--0--> 0
|
|--1--> 1
|
|--2--> 1
|
|--3--> 2
.
.
.
The bits of the table we haven't seen yet are undiscovered territory. When we first walk down the list, it's all undiscovered:
o
.
.
.
When we want to compute the first value, we don't need to know anything more about the table because i <= 1.
helper t i =
  if i <= 1
    then i
    else findTable t (i-1) + findTable t (i-2)
o
|
|--0--> 0
.
.
.
When we want to compute successive values, we're always only looking back into already-discovered parts of the table (dynamic programming, hey-hey!). The key thing to remember is that we're 100% working with immutable values here, no fancy tricks besides laziness. "t" really means the table, and not "the table in its current state at iteration 42". It's just that we only discover the bits of the table that tell us the value corresponding to 42 when we actually ask for them.
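One way to convince yourself of this is to hand-expand what happens when we demand a single entry, say the one at key 3 (a sketch of the evaluation order using the fib helper, not something you'd type in):
--   findTable t 3
-- = helper t 3                        -- the thunk stored at key 3
-- = findTable t 2 + findTable t 1     -- demands two earlier entries
-- = (findTable t 1 + findTable t 0) + 1
-- = (1 + 0) + 1
-- = 2
Note that the entry at key 1 is a single shared thunk, so even though it's demanded twice it's only ever computed once. That sharing is the memoisation.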
Hopefully, with others on StackOverflow, you'll go further than me and not be left mumbling vaguely, "uhm yeah, laziness something or other". It's really not a big deal :-)