
Let's start with a definition: A transducer is a function that takes a reducer function and returns a reducer function.

A reducer is a binary function that takes an accumulator and a value and returns an accumulator. A reducer can be executed with a reduce function (note: all functions are curried, but I've cut this out, along with the definitions for pipe and compose, for the sake of readability - you can see them in the live demo):

const reduce = (reducer, init, data) => {
  let result = init;
  for (const item of data) {
    result = reducer(result, item);
  }
  return result;
}
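
For instance, a quick sanity check (not in the original post; the reduce above is repeated so the snippet runs standalone):

```javascript
const reduce = (reducer, init, data) => {
  let result = init;
  for (const item of data) {
    result = reducer(result, item);
  }
  return result;
};

// Summing an array with a plain reducer
console.log(reduce((acc, n) => acc + n, 0, [1, 2, 3, 4])); // 10
```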

With reduce we can implement map and filter functions:

const mapReducer = xf => (acc, item) => [...acc, xf(item)];
const map = (xf, arr) => reduce(mapReducer(xf), [], arr);

const filterReducer = predicate => (acc, item) => predicate(item) ?
  [...acc, item] :
  acc;
const filter = (predicate, arr) => reduce(filterReducer(predicate), [], arr);

As we can see, there are a few similarities between map and filter, and both of these functions work only with arrays. Another disadvantage is that when we compose these two functions, each step creates a temporary array that gets passed to the next function.

const even = n => n % 2 === 0;
const double = n => n * 2;

const doubleEven = pipe(filter(even), map(double));

doubleEven([1,2,3,4,5]);
// first we get [2, 4] from filter
// then final result: [4, 8]
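
For reference, the pipe used here was cut from the post; a minimal sketch (my own, with curried map and filter, since the post notes all functions are curried) might look like:

```javascript
// Minimal left-to-right composition
const pipe = (...fns) => x => fns.reduce((acc, fn) => fn(acc), x);

const reduce = (reducer, init, data) => {
  let result = init;
  for (const item of data) result = reducer(result, item);
  return result;
};

// Curried variants of the map/filter defined earlier
const map = xf => arr => reduce((acc, item) => [...acc, xf(item)], [], arr);
const filter = predicate => arr =>
  reduce((acc, item) => (predicate(item) ? [...acc, item] : acc), [], arr);

const even = n => n % 2 === 0;
const double = n => n * 2;

const doubleEven = pipe(filter(even), map(double));
console.log(doubleEven([1, 2, 3, 4, 5])); // [4, 8]
```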

Transducers help us solve these concerns: when we use a transducer, no temporary arrays are created, and we can generalize our functions to work not only with arrays. Transducers are generally executed by passing them to a transduce function:

const transduce = (xform, reducer, init, data) =>
  reduce(xform(reducer), init, data);

const mapping = (xf, reducer) => (acc, item) => reducer(acc, xf(item));

const filtering = (predicate, reducer) => (acc, item) => predicate(item) ?
  reducer(acc, item) :
  acc;

const arrReducer = (acc, item) => [...acc, item];

const transformer = compose(filtering(even), mapping(double));

const performantDoubleEven = transduce(transformer, arrReducer, []);

performantDoubleEven([1, 2, 3, 4, 5]); // -> [4, 8] with no temporary arrays created
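
To see this end to end as one runnable snippet (definitions repeated, with mapping and filtering manually curried so they compose without a curry helper):

```javascript
const reduce = (reducer, init, data) => {
  let result = init;
  for (const item of data) result = reducer(result, item);
  return result;
};

const compose = (...fns) => x => fns.reduceRight((acc, fn) => fn(acc), x);

// Curried transducers
const mapping = xf => reducer => (acc, item) => reducer(acc, xf(item));
const filtering = predicate => reducer => (acc, item) =>
  predicate(item) ? reducer(acc, item) : acc;

const arrReducer = (acc, item) => [...acc, item];
const even = n => n % 2 === 0;
const double = n => n * 2;

// filtering wraps mapping, so each item is tested first, then doubled;
// no intermediate array is built between the two steps
const transformer = compose(filtering(even), mapping(double));
console.log(reduce(transformer(arrReducer), [], [1, 2, 3, 4, 5])); // [4, 8]
```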

We can even define array map and filter using transduce, because transducers are so composable:

const map = (xf, data) => transduce(mapping(xf), arrReducer, [], data);

const filter = (predicate, data) => transduce(filtering(predicate), arrReducer, [], data);
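
And because a transducer only wraps the step function, the same mapping works with any reducer, not just arrReducer; for example, summing directly with no intermediate array at all (a standalone sketch using the curried form of mapping):

```javascript
const reduce = (reducer, init, data) => {
  let result = init;
  for (const item of data) result = reducer(result, item);
  return result;
};

const mapping = xf => reducer => (acc, item) => reducer(acc, xf(item));
const double = n => n * 2;

// Same transducer, different target: numbers are doubled and summed in one pass
console.log(reduce(mapping(double)((acc, n) => acc + n), 0, [1, 2, 3])); // 12
```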

live version if you'd like to run the code -> https://runkit.com/marzelin/transducers

Does my reasoning make sense?

marzelin
    I didn't read your code, but the description is correct, and the output looks correct. Just remember that there are two ways to perform a series of transformations on a list: you can apply the first transform to every element, then the second, etc. Or you can compose the transformations, apply the combined transform to the first element, then the second, etc. Transducers are the second thing. They're even cooler than that, nothing about them says anything about the nature of the collection, so you can use them on Observables, generators, etc. – Jared Smith Sep 11 '18 at 11:04
    I [implemented transducers](https://github.com/jasmith79/transduction) in JavaScript a while back for fun, you can check the repo and compare notes if you like. – Jared Smith Sep 11 '18 at 11:10
    This is good. Didn't know transducers was a thing before this. – rmn Sep 11 '18 at 11:12
    @rmn they were developed for the Clojure language, but you can implement them in just about any language with higher-order functions. See [this for details](https://www.youtube.com/watch?v=6mTbuzafcII) – Jared Smith Sep 11 '18 at 11:13
    Yes, your understanding seems to be fine and match the common expectations. Where did you learn from? Notice that for js specifically, there's also an [interoperability protocol](https://github.com/cognitect-labs/transducers-js#the-transducer-protocol). – Bergi Sep 11 '18 at 12:21
    "*Transducers need a `transduce` function to work*" - not really, `transduce` is so trivial that your code doesn't get any longer or more complicated when it's spelled out with `reduce`. Also `pipe` and `compose` (especially in their variadic version) are not really necessary for the explanation, imo defining `pipe` using reducers is even a distraction. – Bergi Sep 11 '18 at 12:25
    Btw in your final `_filter` implementation you forgot to pass the `arrReducer` argument – Bergi Sep 11 '18 at 12:25
  • @Bergi thanks for the constructive and thorough feedback. I spotted the `transduce` function in the `ramda` lib yesterday and it intrigued me, so I used the power of the internets to figure out what it's all about :) – marzelin Sep 11 '18 at 12:38
  • @Bergi This interoperability protocol is used by `ramda` and gave me a pretty hard time when I couldn't understand why my custom functions didn't work with `R.transduce`. I had to dig deep into their implementation before I found that it uses some magic symbols. Thanks for the link. – marzelin Sep 11 '18 at 12:48
    Yeah when I wrote my implementation I basically just paused my way through the talk while implementing the described semantics, I deliberately ignored the '@@' keys to keep the concepts upfront in my head. – Jared Smith Sep 11 '18 at 13:54
  • @Bergi stateful transducers like `partitionBy(count)` that keep internal state do need a `transduce` function that calls the single-arity version of the step function, which returns the end result at the end (the state keeping track of what is in each bucket for `partitionBy`). See this explanation: https://youtu.be/TaazvSJvBaw. Or is there a way around it to also make stateful transducers work using reduce? – user2609980 Aug 25 '22 at 21:51
    @ErwinRooijakkers The "*single arity version of the step function*" is really just a separate function. You can write `myEnd(reduce(myTransducer(step), init, data)` or `reduce(myTransducer(step), init, data).end()` if you prefer. Personally I don't like "stateful transducers", they distract from the pure concept imo. Btw iirc correctly, there are even some uses cases that require passing the initial value, so you write `reduce(myTransducer({step, init}))(data)` with `reduce = ({step, init, end=id}) => iter => { let acc=init; for (const el of iter) acc=step(acc, el); return end(acc); }` – Bergi Aug 26 '22 at 00:55
  • @Bergi indeed, you could extract the finalization step, thank you. I do agree with Clojure's choice to encapsulate the finalization step in the transducer itself (as 1-arity), and the calling of it in `transduce` or its various counterparts with slightly different behaviour (`sequence` to return a lazy seq, `(into [])` for a specific type (here a vector), and `eduction` to create a lazy seq that does not store realized parts). – user2609980 Aug 26 '22 at 07:56
  • @Bergi in the case of Clojure's stateful transducer `partition-by` the step function cannot be applied to reduce, because the 1-arity function needs state that closes over the 2-arity function. See https://github.com/clojure/clojure/blob/clojure-1.10.1/src/clj/clojure/core.clj#L7172. – user2609980 Aug 26 '22 at 15:05
  • Or am I misunderstanding and is there a way to write a `partitionBy` (go over a collection and partition when the passed function returns another value than the value when applied to the element before) where the `myEnd` outside of `reduce` works? And "stateful transducers" is a misnomer, since it's only the reducing/step function that is stateful, not the transducer itself. So since state is encapsulated in the reducing function, the transducer itself is still pure. – user2609980 Aug 26 '22 at 18:03
  • @Bergi see Rich Hickey's comment on in [Inside Transducers (2014)](https://youtu.be/4KqUvG8HPYo?t=1617): "It's your job as a transducible context to apply the transducer to your step function. That should never be a user space activity. That transduced step function is now potentially a stateful thing, which means you do not want more than one person to have that. You should never hand a transduced reducing function around, which means all transducible contexts should accept a transducer and, inside them, you call the transducer on the step function and make the transformed step function." – user2609980 Aug 26 '22 at 19:30
    @ErwinRooijakkers I'd write `partitionBy = (f) => ({step, init, end=id}) => ({ step(acc, input) { const val = f(input); if (acc.prev === val) return {...acc, tmp: [...acc.tmp, input]}; else return {prev: val, tmp: [], result: step(acc.result, acc.tmp)}; }, init: {prev: undefined, tmp: [], result: init}, end(acc) { if (acc.tmp.length) return end(step(acc.result, acc.tmp)) else return end(acc.result); });` using only pure functions and a triple to represent the partitioning state. No mutable state required! – Bergi Aug 26 '22 at 21:48
    @ErwinRooijakkers Things like `map` and `filter` are much more simple, as they just pass through `init` and `end` and don't change the type of the accumulator: `map = (f) => ({step, init, end}) => ({step(acc, input) { return step(acc, f(input)); }, init, end })`; `filter = (f) => ({step, init, end}) =>({step(acc, input) { if (f(input)) return step(acc, input); else return acc; }, init, end})` – Bergi Aug 26 '22 at 21:58
  • Thank you @Bergi. `(def xf (partition-by true?)) (def f (xf conj)) (f (reduce f [] v))` also works, using `reduce` instead of `transduce`. Let me see if I can translate your stateless `partitionBy` to Clojure. – user2609980 Aug 27 '22 at 10:59
  • @Bergi it seems you are using the `acc` to store state in `acc.tmp`. If you transduce over a composition of `partitionBy` and `partitionBy`, does it not fail, because the `acc` that is passed at the end already has `acc.tmp`? – user2609980 Sep 01 '22 at 15:32
  • @user2609980 It should not fail, the inner `step` and `end` functions are always called with their own accumulator type. Notice however that if you compose two partitioning transducers, the inner one will receive arrays for the input. If you allow me Haskell type syntax to express, it's `partitionBy :: Eq b => (a -> b) -> Reducer [a] c -> Reducer a c` – Bergi Sep 01 '22 at 21:13

1 Answer


Your understanding is correct but incomplete.

In addition to the concepts you've described, transducers can do the following:

  • Support an early-exit semantic
  • Support a completion semantic
  • Be stateful
  • Support an init value for the step function.

So for instance, an implementation in JavaScript would need to do this:

// Ensure reduce preserves early termination
let called = 0;
let updatesCalled = map(a => { called += 1; return a; });
let hasTwo = reduce(compose(take(2), updatesCalled)(append), [1,2,3]).toString();
console.assert(hasTwo === '1,2', hasTwo);
console.assert(called === 2, called);

Here, because of the call to take, the reducing operation bails early.

It needs to be able to (optionally) call the step function with no arguments for an initial value:

// handles lack of initial value
let mapDouble = map(n => n * 2);
console.assert(reduce(mapDouble(sum), [1,2]) === 6);

Here a call to sum with no arguments returns the additive identity (zero) to seed the reduction.

In order to accomplish this, here's a helper function:

const addArities = (defaultValue, reducer) => (...args) => {
  switch (args.length) {
    case 0: return typeof defaultValue === 'function' ? defaultValue() : defaultValue;
    case 1: return args[0];
    default: return reducer(...args);
  }
};

This takes an initial value (or a function that can provide one) and a reducer, and returns a reducer with the seeding behavior built in:

const sum = addArities(0, (a, b) => a + b);
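
A quick check of the three arities (addArities repeated so this runs standalone):

```javascript
const addArities = (defaultValue, reducer) => (...args) => {
  switch (args.length) {
    case 0: return typeof defaultValue === 'function' ? defaultValue() : defaultValue;
    case 1: return args[0];
    default: return reducer(...args);
  }
};

const sum = addArities(0, (a, b) => a + b);

console.assert(sum() === 0);     // init: the additive identity
console.assert(sum(5) === 5);    // completion: accumulator passes through
console.assert(sum(2, 3) === 5); // step: the normal binary reduction
```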

Now sum has the proper semantics, and it's also how append in the first example is defined. For a stateful transducer, look at take (including helper functions):

// Denotes early completion; DONE is a key defined elsewhere in the library
class _Wrapped {
  constructor (val) { this[DONE] = val }
};

const isReduced = a => a instanceof _Wrapped;
// ensures reduced for bubbling
const reduced = a => a instanceof _Wrapped ? a : new _Wrapped(a);
const unWrap = a => isReduced(a) ? a[DONE] : a;

const enforceArgumentContract = f => (xform, reducer, accum, input, state) => {
  // initialization
  if (!exists(input)) return reducer();
  // Early termination, bubble
  if (isReduced(accum)) return accum;
  return f(xform, reducer, accum, input, state);
};

/*
 * factory
 *
 * Helper for creating transducers.
 *
 * Takes a step process and an initial state, and returns a function that
 * takes a transforming function and returns a transducer. The transducer
 * takes a reducing function, an optional collection, and an optional
 * initial value. If no collection is passed it returns a modified reducing
 * function, otherwise it reduces the collection.
 */
const factory = (process, initState) => xform => (reducer, coll, initValue) => {
  let state = {};
  state.value = typeof initState === 'function' ? initState() : initState;
  let step = enforceArgumentContract(process);
  let trans = (accum, input) => step(xform, reducer, accum, input, state);
  if (coll === undefined) {
    return trans; // return transducer
  } else if (typeof coll[Symbol.iterator] === 'function') {
    return unWrap(reduce(...[trans, coll, initValue].filter(exists))); 
  } else {
    throw NON_ITER;
  }
};

const take = factory((n, reducer, accum, input, state) => {
  if (state.value >= n) {
    return reduced(accum);
  } else {
    state.value += 1;
  }
  return reducer(accum, input);
}, () => 0);
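
To make the early-exit machinery concrete without the rest of the library, here is a stripped-down, self-contained sketch (the names are mine, not the library's): reduce checks for the wrapper after every step and stops iterating as soon as it sees one.

```javascript
// Sentinel wrapper marking an early-terminated accumulator
class Reduced {
  constructor(value) { this.value = value; }
}
const reduced = v => (v instanceof Reduced ? v : new Reduced(v));
const isReduced = v => v instanceof Reduced;

// A reduce that honors early termination
const reduce = (reducer, init, data) => {
  let acc = init;
  for (const item of data) {
    acc = reducer(acc, item);
    if (isReduced(acc)) return acc.value;
  }
  return acc;
};

// take(n): stateful - the counter lives in the closure,
// one counter per transduced reduction
const take = n => reducer => {
  let seen = 0;
  return (acc, item) => {
    seen += 1;
    const result = reducer(acc, item);
    return seen >= n ? reduced(result) : result;
  };
};

const append = (acc, item) => [...acc, item];
let calls = 0;
const counting = reducer => (acc, item) => { calls += 1; return reducer(acc, item); };

console.log(reduce(take(2)(counting(append)), [], [1, 2, 3, 4, 5])); // [1, 2]
console.log(calls); // 2 - items 3, 4, 5 were never touched
```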

If you want to see all of this in action I made a little library a while back. Although I ignored the interop protocol from Cognitect (I just wanted to get the concepts) I did try to implement the semantics as accurately as possible based on Rich Hickey's talks from Strange Loop and Conj.

Jared Smith
  • To be honest, these are mostly hacks and not very clean. Especially providing an init value and adding state to it is easily messed up (IIRC the cognitect protocol had a fundamental problem there, not sure how much your implementation deviates). I can't recommend doing the distinction between the functionalities based on the arguments length - overloading methods like that seems fine in clojure but it is not in JavaScript (especially, functional JavaScript). – Bergi Sep 11 '18 at 15:59
  • 1
    @Bergi I had no intention for this to be production-ready (I didn't even add a testing framework, just used `console.assert`). I just naively implemented it for fun, transcribing as I watched the talk. If I wanted to actually use transducers, I'd use e.g. Ramda. I also agree that arity overloading isn't very JavaScript-y, but again, naively transcribed. OP was asking if (s)he understood the *concepts* behind transducers, and I've tried to outline some of the ones that got missed. – Jared Smith Sep 11 '18 at 16:07