I think this is a very good question and I can't disagree with the answers, but I have the feeling we are missing the point entirely.
Thinking of map and reduce more abstractly can provide us with a LOT of very good insights.
This answer is divided into 3 parts:
- Defining and deciding between map and reduce (7 minutes)
- Using reduce intentionally (8 minutes)
- Bridging map and reduce with transducers (5 minutes)
map or reduce
Common traits
map and reduce are implemented in a meaningful and consistent way on a wide range of objects which are not necessarily collections.
They return a value useful to the surrounding algorithm, and they only care about this value.
Their major role is conveying intent regarding transformation or preservation of structure.
Structure
By "structure" I mean a set of conceptual properties which characterise abstract objects, such as an unordered list or a 2D matrix, and their concretion in data structures.
Note that there can be a disconnect between the two:
- an unordered list can be stored as an array, which has the concept of ordering carried by indexed keys;
- a 2D matrix can be stored as a TypedArray, which lacks the concept of dimension (or nesting).
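To make that disconnect concrete, here is a rough sketch (the width bookkeeping is mine) of a 2×2 matrix stored flat in a TypedArray: the rows only exist because we carry the width around ourselves.

const width = 2;
const matrix = Float64Array.of(1, 2, 3, 4); // conceptually [[1, 2], [3, 4]]
const get = (row, col) => matrix[row * width + col];
get(1, 0); // 3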
map
map is a strict structure-preserving transformation.
It is useful to implement it on other kinds of objects to grasp its semantic value:
class A {
constructor (value) {
this.value = value
}
map (f) {
return new A(f(this.value))
}
}
new A(5).map(x => x * 2); // A { value: 10 }
Objects implementing map can have all kinds of behaviours, but they always return the same kind of object you started with while transforming the values with the supplied callback.
Array.map returns an array of the same length and the same ordering as the original.
On the callback arity
Because it preserves structure, map is viewed as a safe operation, but not every callback is equal.
With a unary callback: map(x => f(x)), each value of the array is totally indifferent to the presence of other values.
Using the other two parameters on the other hand introduces coupling, which may not be true to the original structure.
Imagine removing or reordering the second item in the arrays below: doing it before or after the map would not yield the same result.
Coupling with array size:
[6, 3, 12].map((x, _, a) => x/a.length);
// [2, 1, 4]
Coupling with ordering:
['foo', 'bar', 'baz'].map((x, i) => [i, x]);
// [[0, 'foo'], [1, 'bar'], [2, 'baz']]
Coupling with one specific value:
[1, 5, 3].map((x, _, a) => x/Math.max(...a));
//[ 0.2, 1, 0.6]
Coupling with neighbours:
const smooth = (x, i, a) => {
const prev = a[i - 1] ?? x;
const next = a[i + 1] ?? x;
const average = (prev + x + next) / 3;
return Math.round((x + average) / 2);
};
[1, 10, 50, 35, 40, 1].map(smooth);
// [ 3, 15, 41, 38, 33, 8 ]
I recommend making it explicit at the call site whether or not these parameters are used.
const transform = (x, i) => x * i;
❌ array.map(transform);
⭕ array.map((x, i) => transform(x, i));
This has other benefits when you use variadic functions with map.
❌ ["1", "2", "3"].map(parseInt);
// [1, NaN, NaN]
⭕ ["1", "2", "3"].map(x => parseInt(x));
// [1, 2, 3]
reduce
reduce sets a value free from its surrounding structure.
Again, let's implement it on a simpler object:
class A {
constructor (value) {
this.value = value
}
reduce (f, init) {
return init !== undefined
? f(init, this.value)
: this.value
}
}
new A(5).reduce(); // 5
const concat = (a, b) => a.concat(b);
new A(5).reduce(concat, []); // [ 5 ]
Whether you leave the value alone or you put it back into something else, the output of reduce can be of any shape. It is literally the opposite of map.
Implications for arrays
Arrays can contain multiple or zero values, which gives rise to two, sometimes conflicting, requirements.
The need to combine
How can we return multiple values with no structure around them?
It is impossible. In order to return only one value, we have two options:
- summarising the values into one value;
- moving the values into a different structure.
Doesn't it make more sense now?
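As a quick illustration of the two options (examples of my own):

// summarising the values into one value
[1, 2, 3].reduce((a, b) => a + b); // 6

// moving the values into a different structure
['a', 'b'].reduce((acc, x, i) => ({ ...acc, [x]: i }), {}); // { a: 0, b: 1 }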
The need to initialise
What if there is no value to return?
If reduce returned a falsy value, there would be no way to know if the source array was empty or if it contained that falsy value, so unless we provide an initial value, reduce has to throw.
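You can check both behaviours for yourself:

[].reduce((a, b) => a + b); // throws a TypeError
[].reduce((a, b) => a + b, 0); // 0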
The true purpose of the reducer
You should be able to guess what the reducer f does in the following snippet:
[a].reduce(f);
[].reduce(f, a);
Nothing. It is not called.
It is the trivial case: a is the single value we want to return, so f is not needed.
This is by the way the reason why we didn't make the reducer mandatory in our class A earlier: because it contained only one value. It is mandatory on arrays because arrays can contain multiple values.
Since the reducer is only called when you have 2 or more values, saying its sole purpose is to combine them is only a stone's throw away.
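You can verify that the reducer really is ignored in the trivial case:

const f = (a, b) => { console.log('reducer called'); return a + b; };
[42].reduce(f); // 42, nothing is logged
[].reduce(f, 42); // 42, nothing is logged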
On transforming values
On arrays of variable lengths, expecting the reducer to transform the values is dangerous because, as we discovered, it may not be called.
I encourage you to map before you reduce when you need to both transform values and change shape.
It is a good idea to keep these two concerns separate for readability anyway.
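A small sketch of why relying on the reducer to transform values is risky:

// the doubling never happens on a single-value array
[5].reduce((acc, x) => acc + x * 2); // 5

// transforming and combining as separate steps is predictable
[5].map(x => x * 2).reduce((acc, x) => acc + x, 0); // 10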
When not to use reduce
Because reduce is this general-purpose tool for achieving structure transformation, I advise you to avoid it when you want an array back if there exists another, more focussed method which does what you want.
Specifically, if you struggle with nested arrays in a map, think of flatMap or flat before reaching for reduce.
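For instance (examples of my own), the dedicated methods say what they do much better than a reduce-based concatenation:

const nested = [[1, 2], [3], [4, 5]];

// works, but reaches for the general-purpose tool
nested.reduce((acc, x) => acc.concat(x), []); // [1, 2, 3, 4, 5]

// says exactly what it does
nested.flat(); // [1, 2, 3, 4, 5]
nested.flatMap(x => x.map(y => y * 10)); // [10, 20, 30, 40, 50]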
At the heart of reduce: a recursive binary operation
Implementing reduce on arrays introduces this feedback loop where the reducer's first argument is the return value of the previous iteration. Needless to say, it looks nothing like map's callback.
We could implement Array.reduce recursively like so:
const reduce = (f, acc, [current, ...rest]) =>
rest.length == 0
? f(acc, current)
: reduce(f, f(acc, current), rest)
This highlights the binary nature of the reducer f and how its return value becomes the new acc in the next iteration.
I let you convince yourself that the following is true:
reduce(f, a, [b, c, d])
// is equivalent to
f(f(f(a, b), c), d)
// or if you squint a little
((a ❋ b) ❋ c) ❋ d
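If you want the convincing done for you, a reducer which only records how it was called makes the nesting visible (using the recursive reduce defined above):

const show = (a, b) => `(${a} ❋ ${b})`;
reduce(show, 'a', ['b', 'c', 'd']); // "(((a ❋ b) ❋ c) ❋ d)"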
This should seem familiar: you know arithmetic operations obey rules such as "associativity" or "commutativity". What I want to convey here is that the same kind of rules apply.
While reduce may strip out the surrounding structure, the values are still bound together in an algebraic structure for the time of the transformation.
the algebra of reducers
Algebraic structures are way out of the scope of this answer, so I will only touch on how they are relevant.
((a ❋ b) ❋ c) ❋ d
Looking at the expression above, it is self-evident that there is a constraint that ties all the values together: ❋ must know how to combine them the same way + must know how to combine 1 + 2 and, just as importantly, (1 + 2) + 3.
Weakest safe structure
One way to ensure this is to enforce that these values belong to the same set, on which the reducer is an "internal" or "closed" binary operation, that is to say: combining any two values from this set with the reducer produces a value which belongs to the same set.
In abstract algebra this is called a magma. You can also look up semi-groups, which are more talked about and are the same thing with associativity (no braces required), although reduce doesn't care.
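Merging plain objects is one everyday example of a closed operation: combining any two of them always gives you back another plain object, so the reducer can never paint itself into a corner.

const merge = (a, b) => ({ ...a, ...b });
[{ a: 1 }, { b: 2 }, { a: 3 }].reduce(merge); // { a: 3, b: 2 }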
Less safe
Living in a magma is not absolutely necessary: we can imagine a situation where ❋ can combine a and b but not c and b.
An example of this is function composition. One of the following functions returns a string, which constrains the order in which you can combine them:
const a = x => x * 2;
const b = x => x ** 2;
const c = x => x + ' !';
// (a ∘ b) ∘ c
const abc = x => c(b(a(x)));
abc(5); // "100 !"
// (a ∘ c) ∘ b
const acb = x => b(c(a(x)));
acb(5); // NaN
Like many binary operations, function composition can be used as a reducer.
Knowing if we are in a situation where reordering or removing elements from an array could make reduce break is kind of valuable.
So, magmas: not absolutely necessary, but very important.
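To make the "composition as a reducer" remark concrete, here is a sketch reusing a, b and c from above:

const pipe = (f, g) => x => g(f(x));

[a, b, c].reduce(pipe)(5); // "100 !"
[a, c, b].reduce(pipe)(5); // NaN, reordering breaks the reduction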
what about the initial value
Say we want to prevent an exception from being thrown when the array is empty, by introducing an initial value:
array.reduce(f, init)
// which is really the same as doing
[init, ...array].reduce(f)
// or
((init ❋ a) ❋ b) ❋ c...
We now have an additional value. No problem.
"No problem"!? We said the purpose of the reducer was to combine the array values, but init
is not a true value: it was forcefully introduced by ourselves, and it should not affect the result of reduce.
The question is:
What init should we pick so that f(init, a) or init ❋ a returns a?
We want an initial value which acts as though it was not there. We want a neutral element (or "identity").
You can look up unital magmas or monoids (the same with associativity) which are swear words for magmas equipped with a neutral element.
Some neutral elements
You already know a bunch of neutral elements:
numbers.reduce((a, b) => a + b, 0)
numbers.reduce((a, b) => a * b, 1)
booleans.reduce((a, b) => a && b, true)
strings.reduce((a, b) => a.concat(b), "")
arrays.reduce((a, b) => a.concat(b), [])
vec2s.reduce(([u,v], [x,y]) => [u+x,v+y], [0,0])
mat2s.reduce(dot, [[1,0],[0,1]])
You can repeat this pattern for many kinds of abstractions. Note that the neutral element and the computation don't need to be this trivial (extreme example).
Neutral element hardships
We have to accept the fact that some reductions are only possible for non-empty arrays and that adding poor initialisers don't fix the problem.
Some examples of reductions gone wrong:
Only partially neutral
// works: f(0, x) = x - 0 = x
numbers.reduce((a, b) => b - a, 0)

// does not work: f(0, x) = 0 - x = -x
numbers.reduce((a, b) => a - b, 0)
Subtracting 0 from b returns b, but subtracting b from 0 returns -b.
We say that only "right-identity" is true.
Not every non-commutative operation lacks a symmetrical neutral element, but non-commutativity is a good sign that one may be missing.
Out of range
const min = (a, b) => a < b ? a : b;
// Do you really want to return Infinity?
numbers.reduce(min, Infinity)
Infinity is the only initial value which does not change the output of reduce for non-empty arrays, but it is unlikely that we would want it to actually appear in our program.
The neutral element is not some Joker value we add as a convenience. It has to be an allowed value, otherwise it doesn't accomplish anything.
Nonsensical
The reductions below rely on position, but adding an initialiser naturally shifts the first element to the second place, which requires messing with the index in the reducer to maintain the behaviour.
const first = (a, b, i) => !i ? b : a;
things.reduce(first, null);
const camelCase = (a, b, i) => a + (
!i ? b : b[0].toUpperCase() + b.slice(1)
);
words.reduce(camelCase, '');
It would have been a lot cleaner to embrace the fact that the array can't be empty and simplify the definition of the reducers.
Moreover, the initial values are degenerate: there is no way to preserve the notion of "firstness" with an initial value.
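Dropping the initialisers lets the reducers shrink back to their natural form (assuming, as stated, non-empty arrays):

// "firstness" falls out naturally: keep the accumulator, ignore the rest
things.reduce(a => a);

// the first word is left untouched, every following word gets capitalised
words.reduce((a, b) => a + b[0].toUpperCase() + b.slice(1));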
conclusion
Algebraic structures can help us think of our programs in a more systematic way. Knowing which one we are dealing with lets us predict exactly what we can expect from reduce, so I can only advise you to look them up.
One step further
We have seen how different map and reduce are structure-wise, but it is not as though they were two isolated things.
We can express map in terms of reduce, because it is always possible to rebuild the same structure we started with.
const map = f => (acc, x) =>
acc.concat(f(x))
;
const double = x => x * 2;
[1, 2, 3].reduce(map(double), []) // [2, 4, 6]
Pushing it a little further has led to neat tricks such as transducers.
I will not go into much detail about them, but I want you to notice a couple of things which will echo what we have said before.
Transducers
First let's see what problem we are trying to solve:
[1, 2, 3, 4].filter(x => x % 2 == 0)
.map(x => x ** 2)
.reduce((a, b) => a + b)
// 20
We are iterating 3 times and creating 2 intermediary data structures. This code is declarative, but not efficient. Transducers attempt to reconcile the two.
First, a little util for composing functions using reduce, because we are not going to use method chaining:
const composition = (f, g) => x => f(g(x));
const identity = x => x;
const compose = (...functions) =>
functions.reduce(composition, identity)
;
// compose(a, b, c) is the same as x => a(b(c(x)))
Now pay attention to the implementation of map and filter below. We are passing in this reducer function instead of concatenating directly.
const map = f => reducer => (acc, x) =>
reducer(acc, f(x))
;
const filter = f => reducer => (acc, x) =>
f(x) ? reducer(acc, x) : acc
;
Look at this more specifically:
reducer => (acc, x) => [...]
After the callback function f is applied, we are left with a function which takes a reducer as input and returns a reducer.
These symmetrical functions are what we pass to compose:
const pipeline = compose(
filter(x => x % 2 == 0),
map(x => x ** 2)
);
Remember compose is implemented with reduce: our composition function defined earlier combines our symmetrical functions.
The output of this operation is a function of the same shape: something which expects a reducer and returns a reducer, which means:
- we have a magma. We can keep composing transformations as long as they have this shape.
- we can consume this chain by applying the resulting function with a reducer, which will return a reducer that we can use with reduce.
I let you expand the whole thing if you need convincing. If you do so, you will notice that transformations will conveniently be applied left to right, which is the opposite direction of compose.
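For reference, the expansion of the pipeline above goes like this, and you can see that the filter test runs before the mapping even though filter comes first in compose:

// pipeline(add)
// = filter(x => x % 2 == 0)(map(x => x ** 2)(add))
// = filter(x => x % 2 == 0)((acc, x) => add(acc, x ** 2))
// = (acc, x) => x % 2 == 0 ? add(acc, x ** 2) : acc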
Alright, let's use this weirdo:
const add = (a, b) => a + b;
const reducer = pipeline(add);
const init = 0; // the neutral element of addition
[1, 2, 3, 4].reduce(reducer, init); // 20
We have composed operations as diverse as map, filter and reduce into a single reduce, iterating only once with no intermediary data structure.
This is no small achievement! And it is not a scheme you can come up with by deciding between map and reduce merely on the basis of the conciseness of the syntax.
Also notice that we have full control over the initial value and the final reducer. We used 0 and add, but we could have used [] and concat (more realistically push, performance-wise) or any other data structure for which we can implement a concat-like operation.
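For instance, the same pipeline can collect into an array instead of summing (a quick sketch):

const concat = (a, b) => a.concat(b);
[1, 2, 3, 4].reduce(pipeline(concat), []); // [4, 16]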