15

For the sake of example let's define a toy automaton type:

data Automaton =
  Auto
    { success ::
      Automaton
    , failure ::
      Automaton
    }

This structure is designed to by cyclic, we can imagine each Automaton as a state with success and failure transitions to other states. So finite automata must be defined recursively. For example here is the simplest automaton:

sink =
  Auto sink sink

It consists of 1 state that always transitions to itself. If we want we can make more complex automata:

-- Transitions to a sink once it encounters a failure
otto1 =
  Auto otto1 sink

-- Mutually recursive automata
otto2 =
  Auto otto2 otto3

otto3 =
  Auto otto3 otto2

These are nice. But it might be nice to take user input and construct an automaton. For example maybe build one out of a transition matrix. Here's a naive implementation:

fromTransition :: [(Int, Int)] -> Automaton
fromTransition tMatrix =
  go 0
  where
    go n =
      let
        (succ, fail) =
          tMatrix !! n
      in
        Auto (go succ) (go fail)

However when we try to do this there starts to be an issue. Our previous examples which were O(1) to follow a transition. However automata produced by this are O(n) to follow a transition, since barring caching, a list must be indexed every time we make a transition. In addition, the input list must be kept in memory as long as this automaton is. Making this basically just worse than using the transition matrix as the automaton.

What I'd really like is automata built dynamically with the method to be just as performant as ones built statically like the ones shown earlier. I'd like some way to analyze the input, construct an automaton and then free the input up up.

In a language with mutation this is easy to do because we can create the structure bit by bit, leaving holes behind to correct later.

I'd also really like to not drag the IO in because once introduced it can't be contained.

Is there a nice way to allocate cyclic structures dynamically like I want?

Wheat Wizard
  • 3,982
  • 14
  • 34
  • 2
    Pedantic note: as-is, your `Automaton` type is not very useful since there is no way to distinguish one value from another. So, `fromTransition _ = let a = Auto a a in a` would _technically_ be a correct answer. Of course, adding some field to `Automaton`, say an `Int`, makes the problem no longer trivial. – chi Dec 24 '21 at 10:32

1 Answers1

16

Laziness to the rescue. We can recursively define a list of all the sub-automata, such that their transitions index into that same list:

fromTransition :: [(Int, Int)] -> Automaton
fromTransition m = a !! 0 where
  a = map (\(succ,fail) -> Auto (a !! succ) (a !! fail)) m

After all the transitions have been traversed at least once, the resulting automaton will be the cyclic graph you expect, without any reference to the matrix (and in particular, transitions will be taken in constant time).

We can also force the automaton ahead of time using seq.

fromTransition :: [(Int, Int)] -> Automaton
fromTransition m = forced `seq` (a !! 0) where
  a = map (\(succ,fail) -> Auto (a !! succ) (a !! fail)) m
  forced = foldr (\(Auto x y) r -> x `seq` y `seq` r) () a
Li-yao Xia
  • 31,896
  • 2
  • 33
  • 56
  • 3
    You can reduce building time by converting the transition matrix to `IntMap (Int, Int)` using `fromDistinctAscList (zip [0..] m)` and then just `fmap`ping over it to make `a`. Now lookups are logarithmic time and fast. – dfeuer Dec 24 '21 at 01:13
  • 2
    For that matter, you could do the same with an array! Anything but a list. – dfeuer Dec 24 '21 at 01:34
  • 2
    I believe this is sometimes called “tying the knot”; that is a good term to search if one wants more information about this method. – jlwoodwa Dec 24 '21 at 11:16
  • shouldn't `fromTransition ((0,0):undefined)` work though? also `... ((0,1),(1,0):undefined)` ; a.o.t. `... ((0,1),(1,2):undefined)`. – Will Ness Dec 24 '21 at 14:25
  • I don't think that should necessarily work, but if you wanted to you would have to find a smarter way to force only the elements you need. – Li-yao Xia Dec 24 '21 at 15:19
  • I couldn't find it in the five to ten minutes I spent on it. hoped you would... :) what I had was `fromTransition m = aforce (a !! 0) where { a = map (\(s,f) -> Auto (a !! s) (a !! f)) m }` but then `aforce`, couldn't come up with the proper way to define it that would work in all the needed cases. `((0,0):undefined)` worked, but the others would cause a stack overflow or with other definitions they would succeed where they shouldn't, like with `((0,1),(1,2):undefined)`. – Will Ness Dec 24 '21 at 15:31
  • 1
    It's going to be a DFS, and the logic to avoid cycles and duplicate work can be a bit tricky. – Li-yao Xia Dec 24 '21 at 19:18
  • I was hoping for some trick from the libraries that would get us the normal form by itself, like `deepseq`. – Will Ness Dec 25 '21 at 06:33