Finite State Transducers in Haskell?

Question

I've been wondering if there is a way to define and work with finite state transducers in Haskell in an idiomatic way.

You can approach FSTs as generators (it generates an output of type {x1,x2}), or as recognizers (given an input of type {x1,x2} it recognizes it if it belongs to the rational relation), or as translators (given an input tape, it translates it into an output tape). Would the representation change depending on the approach?

Would it also be possible to model a FST in a way that you can produce one by specifying rewriting rules? E.g creating a DSL to model rewriting rules, and then creating a function createFST :: [Rule] -> FST.

The closest I could find is Kmett, Bjarnason and Cough's machines library: https://hackage.haskell.org/package/machines

But I can't seem to realize how to model a FST with a Machine. I'd suppose that the correct way of doing it would be similar to how they define Moore and Mealy machines: define a FST as a different entity, but provide an instance of Automaton to be able to use it as a machine.

I found some other options too, but they define it in a straightforward way (like in https://hackage.haskell.org/package/fst ). That doesn't convince me much, as I wonder if there's a better way to do so idiomatically using the strengths of Haskell's type system (like how Moore and Mealy machines are defined in the machines library).

This question needs editing to be on-topic for StackOverflow. Explicitly asking for a library or resource recommendation is off-topic, but describing a specific problem and soliciting solutions (which may very well involve libraries) is acceptable. Asking the *best* way to do something is usually too broad or opinion-based. Asking for *a* way to do something still results in high quality answers, but with less opinionated discussion of their merits. — Cirdec, Jan 17 '15 at 08:53

score 28 · Accepted Answer · edited May 23 '17 at 11:47

A Mealy machine alternately reads an a from a stream of inputs a and outputs a b to a stream of outputs. It reads first and then outputs once after each read.

newtype Mealy a b = Mealy { runMealy :: a -> (b, Mealy a b) }

A Moore machine alternately outputs a b to a stream of outputs and reads an input a from a stream of inputs. It starts with an output of b and then reads once after each output.

data Moore a b = Moore b (a -> Moore a b)

An FST either reads from it's input, writes to its output, or stops. It can read as many times in a row as it wants or write as many times in a row as it wants.

data FST a b
    = Read  (a -> FST a b)
    | Write (b,   FST a b)
    | Stop

The equivalent of an FST from machines is Process. It's definition is a little spread out. To simplify the discussion we are going to forget about Process for now and explore it from the inside-out.

The base functor

To describe what a Process is, we're going to first notice a pattern in all three machines so far. Each of them recursively refers to itself for "what to do next". We are going to replace "what to do next" with any type next. The Mealy machine, while mapping an input to an output, also provides the next machine to run.

newtype MealyF a b next = MealyF { runMealyF :: a -> (b, next) }

The Moore machine, after outputting and requesting an input, figures out the next machine to run.

data MooreF a b next = MooreF b (a -> next)

We can write the FST the same way. When we Read from the input we'll figure out what to do next depending on the input. When we Write to the output we'll also provide what to do next after outputting. When we Stop there's nothing to do next.

data FSTF a b next
    = Read  (a -> next)
    | Write (b,   next)
    | Stop

This pattern of eliminating explicit recursion shows up repeatedly in Haskell code, and is usually called a "base functor". In the machines package the base functor is Step. Compared to our code, Step has renamed the type variable for the output to o, what to do next to r, reading to Await, and writing to Yield.

data Step k o r
  = forall t. Await (t -> r) (k t) r
  | Yield o r
  | Stop

Awaiting is a little more complicated than Read because a Machine can read from multiple sources. For Processes that can only read from a single source, k is Is applied to a specific type, which is a proof the second type Is the first type. For a Process reading inputs a, k will be Is a.

data Step (Is a) o r
  = forall t. Await (t -> r) (Is a t) r
  | Yield o r
  | Stop

The existential quantification forall t. is an implementation detail for dealing with Sources. After witnessing that a ~ t this becomes.

data Step (Is a) o r
  = forall t ~ a. Await (t -> r) Refl r
  | Yield o r
  | Stop

If we unify t with a and remove the Refl constructor which is always the same, this looks like our FSTF.

data Step (Is a) o r
  = Await (a -> r) r
  | Yield  o    r
  | Stop

The extra r for what to do next in Await is what to do next when there's no more input.

The machine transformer `MachineT`

The machine transformer, MachineT, makes Step look almost like our FST. It says, "A machine operating over some monad m is what to do in that monad to get the next Step. The next thing to do after each step is another MachineT."

newtype MachineT m k o = MachineT { runMachineT :: m (Step k o (MachineT m k o)) }

Overall, specialized for our types, this looks like

newtype MachineT m (Is a) o = 
    MachineT m (
        Await (a -> MachineT m (Is a) o) (MachineT m (Is a) o)
      | Yield  o   (MachineT m (Is a) o)
      | Stop
    )

Machine is a pure MachineT.

type Machine k o = forall m. Monad m => MachineT m k o

Universal quantification over all Monads m is another way of saying a computation doesn't need anything from an underlying Monad. This can be seen by substituting Identity for m.

type Machine k o = 
    MachineT Identity (
        Await (a -> MachineT Identity k o) (MachineT Identity k o)
      | Yield  o   (MachineT Identity k o)
      | Stop
    )

Processes

A Process or ProcessT is a Machine or MachineT that only reads a single type of input a, Is a.

type Process    a b = Machine    (Is a) b
type ProcessT m a b = MachineT m (Is a) b

A Process has the following structure after removing all the intermediate constructors that are always the same. This structure is exactly the same as our FST, except it has an added "what to do next" in the case that there's no more input.

type Process a b =
    Await (a -> Process a b) (Process a b)
  | Yield  b   (Process a b)
  | Stop

The ProcessT variant has an m wrapped around it so that it can act in the monad at each step.

Process models state transducers.

Fantastic `machines` tutorial, thanks. I've been meaning to try to understand that library for quite some time. I wonder if there is a way to attach this discussion to the official documentation somehow. — Daniel Wagner, Jan 17 '15 at 18:16
Awesome. With Moore and Mealy machines in the `machines` library, you can create a machine by unfolding over explicit state using `unfoldMoore` or `unfoldMealy`. These functions model the transition functions that define a Moore/Mealy machine, so you can still use them to construct a machine in the traditional way (defining the states, then defining the transition and output function). Is there a similar function that can be created for FSTs (i.e `Process`)? — gonzaw, Jan 17 '15 at 18:37
@gonzaw That would make a good follow up question. This question was about how to model finite state machines in Haskell, that question would be about how to specifically `unfold` a `Process` using the machines library. The question will be better if you can provide a specific problem, like how to unfold a `Process` from `data State = RepeatNext Word8 | SayAgain Word8 Word8` and `step (RepeatNext n) = Await (x -> SayAgain n x) Stop; step (SayAgain 0 _) = Await (n -> RepeatNext n) Stop; step (SayAgain n x) = Yield x (SayAgain (n-1) x)` — Cirdec, Jan 17 '15 at 19:03
Ok. I'll first try toying around with `Process` to try and see if I can figure it out. If not I'll make a follow up question. I'll mark this as answered then. — gonzaw, Jan 17 '15 at 19:23
I'm not convinced that this models finite state transducers in Haskell, but rather (potentially infinite) state transducers. For instance, finite state transducers can be inverted. How would one invert an FST by this definition? I don't think it's possible. — Nathan BeDell, Oct 06 '22 at 02:14

Finite State Transducers in Haskell?

1 Answers1

The base functor

The machine transformer `MachineT`

Processes

Linked