3

This is a function which deep-flattens an array

const deepFlatten = (input) => {
  let result = [];
  input.forEach((val, index) => {
    if (Array.isArray(val)) {
      result.push(...deepFlatten(val));
    } else {
      result.push(val);
    }
  });
  return result;
};

During a discussion, I was told it is not memory efficient, as it might cause stack overflows.

I read in http://2ality.com/2015/06/tail-call-optimization.html that I could potentially re-write it so that it is TCO-ed.

How would that look like and how could I measure it's memory usage profile?

George Katsanos
  • 13,524
  • 16
  • 62
  • 98
  • not sure just suggesting `var deepFlatten(input) => input.reduce((a,b)=>!Array.isArray(b)?a.concat(b):a.concat(deepFlatten(b)), [])` – MotKohn Sep 26 '17 at 20:13
  • To measure memory usage I would just put a conditional breakpoint. e.g. `deepFlatten([1,[2,[3,[4,5]]]])` then put a breakpoint if val === 5 then see call stack how deep it is – MotKohn Sep 26 '17 at 20:22
  • if you elaborate a bit and add it as an answer it'll definitely get at least upvoted! – George Katsanos Sep 26 '17 at 20:28
  • No. You should talk to your discussion partner again. This does not cause a stack overflow because of its recursiveness. – Bergi Sep 26 '17 at 20:56
  • @Bergi why does it cause a stack overflow then? – George Katsanos Sep 26 '17 at 21:02
  • @GeorgeKatsanos I'm pretty sure your advisor was referring to the spread operator, which can overflow the stack if you pass (really) many arguments to a function. To traverse the nesting of arrays, recursion is totally fine (assuming your edge case is not a thousands-of-levels nesting, in which case you might want to manage a stack manually, but it doesn't get any more efficient by that - it just allocates more memory without breaking). – Bergi Sep 26 '17 at 21:05
  • I think he said that this will cause a stack overflow for some thousand arrays.. - I mentioned the TCO opt but he said there's an imperative do-while solution that could avoid those issues – George Katsanos Sep 26 '17 at 21:10

3 Answers3

4

tail calls in general

I've shared another functional approach to flattening arrays in JavaScript; I think that answer shows a better way to solve this particular problem, but not all functions can be decomposed so nicely. This answer will focus on tail calls in recursive functions, and tail calls in general

In general, to move the recurring call into tail position, an auxiliary function (aux below) is created where the parameters of the function holds all the necessary state to complete that step of the computation

const flattenDeep = arr =>
  {
    const aux = (acc, [x,...xs]) =>
      x === undefined
        ? acc
        : Array.isArray (x)
          ? aux (acc, x.concat (xs))
          : aux (acc.concat (x), xs)
    return aux ([], arr)
  }
        
const data = 
  [0, [1, [2, 3, 4], 5, 6], [7, 8, [9]]]
  
console.log (flattenDeep (data))
// [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]

js doesn't really have tail call elimination

However, most JavaScript implementations still don't support tail calls - you'll have to approach this differently if you want to use recursion in your program and not worry about blowing the stack - this is also something I've already written a lot about, too

My current go-to is the clojure-style loop/recur pair because it gives you stack safety while simultaneously affording your program to be written using a beautiful, pure expression

const recur = (...values) =>
  ({ type: recur, values })
  
const loop = f =>
  {
    let acc = f ()
    while (acc && acc.type === recur)
      acc = f (...acc.values)
    return acc
  }
  
const flattenDeep = arr =>
  loop ((acc = [], [x,...xs] = arr) =>
    x === undefined
      ? acc
      : Array.isArray (x)
        ? recur (acc, x.concat (xs))
        : recur (acc.concat (x), xs))
        
let data = []
for (let i = 2e4; i>0; i--)
  data = [i, data]

// data is nested 20,000 levels deep
// data = [1, [2, [3, [4, ... [20000, []]]]]] ...

// stack-safe !
console.log (flattenDeep (data))
// [ 1, 2, 3, 4, ... 20000 ]

an important position

why is tail position so important anyway? well have you ever thought about that return keyword? That's the way out of your function; and in a strictly-evaluated language like JavaScript, return <expr> means everything in expr needs to be computed before we can send the result out.

If expr contains a sub-expression with function calls that are not in tail position, those calls will introduce a new frame, compute an intermediate value, and then return it to the calling frame for the tail call – which is why the stack can overflow if there's no way to identify when it's safe to remove a stack frame

Anyway, it's hard to talk about programming, so hopefully this small sketch helps identify calling positions in some common functions

const add = (x,y) =>
  // + is in tail position
  x + y

const sq = x =>
  // * is in tail position
  x * x
  
const sqrt = x =>
  // Math.sqrt is in tail position
  Math.sqrt (x)

const pythag = (a,b) =>
  // sqrt is in tail position
  // sq(a) and sq(b) must *return* to compute add
  // add must *return* to compute sqrt
  sqrt (add (sq (a), sq (b)))

// console.log displays the correct value becaust pythag *returns* it
console.log (pythag (3,4)) // 5

Stick with me here for a minute – now imagine there was no return values – since a function has no way to send a value back to the caller, of course we could easily reason that all frames can be immediately discarded after the function has been evaluated

// instead of
const add = (x,y) =>
  { return x + y }

// no return value
const add = (x,y) =>
  { x + y }

// but then how do we get the computed result?
add (1,2) // => undefined

continuation passing style

Enter Continuation Passing Style – by adding a continuation parameter to each function, it's as if we invent our very own return mechanism

Don't get overwhelmed by the examples below – most people have already seen continuation passing style in the form of these misunderstood things called callbacks

// jQuery "callback"
$('a').click (event => console.log ('click event', event))

// node.js style "callback"
fs.readFile ('entries.txt', (err, text) =>
  err
    ? console.error (err)
    : console.log (text))

So that's how you work with the computed result – you pass it to a continuation

// add one parameter, k, to each function
// k makes *return* into a normal function
// note {}'s are used to suppress the implicit return value of JS arrow functions
const add = (x,y,k) =>
  { k (x + y) }

const sq = (x,k) =>
  { k (x * x) }
  
const sqrt = (x,k) =>
  { k (Math.sqrt (x)) }

const pythag = (a,b,k) =>
  // sq(a) is computed, $a is the result
  sq (a, $a => {
    // sq(b) is computed, $b is the result
    sq (b, $b => {
      // add($a,$b) is computed, $sum is the result
      add ($a, $b, $sum => {
        // sqrt ($sum) is computed, conintuation k is passed thru
        sqrt ($sum, k) }) }) })
  
// here the final continuation is to log the result
// no *return* value was used !
// no reason to keep frames in the stack !
pythag (3, 4, $c => { console.log ('pythag', $c) })

how to get a value out?

This famous question: How do I return the response from an asynchronous call? has baffled millions of programmers – only, it really has nothing to do with "an asynchronous call" and everything to do with continuations and whether those continuations return anything

  // nothing can save us...
  // unless pythag *returns*
  var result = pythag (3,4, ...)
  console.log (result) // undefined

Without a return value, you must use a continuation to move the value to the next step in the computation – this can't be the first way I've tried to say that ^^

but everything is in tail position !

I know it might be hard to tell just by looking at it, but every function has exactly one function call in tail position – if we restore the return functionality in our functions, the value of call 1 is the value of call 2 is the value of call 3, etc – there's no need to introduce a new stack frame for subsequent calls in this situation – instead, call 1's frame can be re-used for call 2, and then re-used again for call 3; and we still get to keep the return value !

// restore *return* behaviour
const add = (x,y,k) =>
  k (x + y)

const sq = (x,k) =>
  k (x * x)
  
const sqrt = (x,k) =>
  k (Math.sqrt (x))

const pythag = (a,b,k) =>
  sq (a, $a =>
    sq (b, $b =>
      add ($a, $b, $sum =>
        sqrt ($sum, k))))
  
// notice the continuation returns a value now: $c
// in an environment that optimises tail calls, this would only use 1 frame to compute pythag
const result =
  pythag (3, 4, $c => { console.log ('pythag', $c); return $c })
  
// sadly, the environment you're running this in likely took almost a dozen
// but hey, it works !
console.log (result) // 5

tail calls in general; again

this conversion of a "normal" function to a continuation passing style function can be a mechanical process and done automatically – but what's the real point of putting everything into tail position?

Well if we know that frame 1's value is the value of frame 2's, which is the value of frame 3's, and so on, we can collapse the stack frames manually use a while loop where the computed result is updated in-place during each iteration – a function utilising this technique is called a trampoline

Of course trampolines are most often talked about when writing recursive functions because a recursive function could "bounce" (spawn another function call) many times; or even indefinitely – but that doesn't mean we can't demonstrate a trampoline on our pythag function that would only spawn a few calls

const add = (x,y,k) =>
  k (x + y)

const sq = (x,k) =>
  k (x * x)
  
const sqrt = (x,k) =>
  k (Math.sqrt (x))

// pythag now returns a "call"
// of course each of them are in tail position ^^
const pythag = (a,b,k) =>
  call (sq, a, $a =>
    call (sq, b, $b =>
      call (add, $a, $b, $sum =>
        call (sqrt, $sum, k))))
  

const call = (f, ...values) =>
  ({ type: call, f, values })
  
const trampoline = acc =>
  {
    // while the return value is a "call"
    while (acc && acc.type === call)
      // update the return value with the value of the next call
      // this is equivalent to "collapsing" a stack frame
      acc = acc.f (...acc.values)
    // return the final value
    return acc
  }

// pythag now returns a type that must be passed to trampoline
// the call to trampoline actually runs the computation
const result =
  trampoline (pythag (3, 4, $c => { console.log ('pythag', $c); return $c }))

// result still works
console.log (result) // 5

why are you showing me all of this?

So even tho our environment doesn't support stack-safe recursion, as long as we keep everything in tail position and use our call helper, we can now convert any stack of calls into a loop

// doesn't matter if we have 4 calls, or 1 million ... 
trampoline (call (... call (... call (...))))

In the first code example, I showed using an auxiliary loop, but I also used a pretty clever (albeit inefficient) loop that didn't require deep recurring into the data structure – sometimes that's not always possible; eg, sometimes your recursive function might spawn 2 or 3 recurring calls – what to do then ?

Below I'm going to show you flatten as a naive, non-tail recursive procedure – what's important to note here is that one branch of the conditional results in two recurring calls to flatten – this tree-like recurring process might seem hard to flatten into an iterative loop at first, but a careful, mechanical conversion to continuation passing style will show this technique can work in almost any (if not all) scenarios

[ DRAFT ]

// naive, stack-UNSAFE
const flatten = ([x,...xs]) =>
  x === undefined
    ? []
    : Array.isArray (x)
      // two recurring calls
      ? flatten (x) .concat (flatten (xs))
      // one recurring call
      : [x] .concat (flatten (xs))

Continuation passing style

// continuation passing style
const flattenk = ([x,...xs], k) =>
  x === undefined
    ? k ([])
    : Array.isArray (x)
      ? flattenk (x, $x =>
          flattenk (xs, $xs =>
            k ($x.concat ($xs))))
      : flattenk (xs, $xs =>
          k ([x].concat ($xs)))

Continuation passing style with trampoline

const call = (f, ...values) =>
  ({ type: call, f, values })
      
const trampoline = acc =>
  {
    while (acc && acc.type === call)
      acc = acc.f (...acc.values)
    return acc
  }

const flattenk = ([x,...xs], k) =>
  x === undefined
    ? call (k, [])
    : Array.isArray (x)
      ? call (flattenk, x, $x =>
          call (flattenk, xs, $xs =>
            call (k, $x.concat ($xs))))
      : call (flattenk, xs, $xs =>
          call (k, ([x].concat ($xs))))

const flatten = xs =>
  trampoline (flattenk (xs, $xs => $xs))

let data = []
for (let i = 2e4; i>0; i--)
  data = [i, data];

console.log (flatten (data))

wups, you accidentally discovered monads

[ DRAFT ]

// yours truly, the continuation monad
const cont = x =>
  k => k (x)

// back to functions with return values
// notice we don't need the additional `k` parameter
// but this time wrap the return value in a continuation, `cont`
// ie, `cont` replaces *return*
const add = (x,y) =>
  cont (x + y)

const sq = x =>
  cont (x * x)
  
const sqrt = x =>
  cont (Math.sqrt (x))

const pythag = (a,b) =>
  // sq(a) is computed, $a is the result
  sq (a) ($a =>
    // sq(b) is computed, $b is the result
    sq (b) ($b =>
      // add($a,$b) is computed, $sum is the result
      add ($a, $b) ($sum =>
        // sqrt ($sum) is computed, a conintuation is returned 
        sqrt ($sum))))
  
// here the continuation just returns whatever it was given
const $c =
  pythag (3, 4) ($c => $c)
  
console.log ($c)
// => 5

delimited continuations

[ DRAFT ]

const identity = x =>
  x

const cont = x =>
  k => k (x)

// reset
const reset = m =>
  k => m (k)

// shift
const shift = f =>
  k => f (x => k (x) (identity))

const concatMap = f => ([x,...xs]) =>
  x === undefined
    ? [ ]
    : f (x) .concat (concatMap (f) (xs))

// because shift returns a continuation, we can specialise it in meaningful ways
const amb = xs =>
  shift (k => cont (concatMap (k) (xs)))

const pythag = (a,b) =>
  Math.sqrt (Math.pow (a, 2) + Math.pow (b, 2))

const pythagTriples = numbers =>
  reset (amb (numbers) ($x =>
          amb (numbers) ($y =>
            amb (numbers) ($z =>
              // if x,y,z are a pythag triple
              pythag ($x, $y) === $z
                // then continue with the triple
                ? cont ([[ $x, $y, $z ]])
                // else continue with nothing
                : cont ([ ])))))
        (identity)
        
console.log (pythagTriples ([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]))
// [ [ 3, 4, 5 ], [ 4, 3, 5 ], [ 6, 8, 10 ], [ 8, 6, 10 ] ]      
Mulan
  • 129,518
  • 31
  • 228
  • 259
  • you're using destructuring and spread extensively and that for me makes it a bit harder to read but it's definitely interesting.upvoting – George Katsanos Sep 28 '17 at 09:22
  • @GeorgeKatsanos there's a lot to write about this topic and I've meant to put this answer together for too long now – only, I must get to bed ! I will leave the rest of this as a draft and finish it up tomorrow ^^ – Mulan Sep 28 '17 at 10:56
1

You can't optimize it when your recursive call is inside forEach, because in order to apply TCO, the compiler need to check that you not saving a "state" of the previous call. In case of forEach you do save a "state" of the current position.

In order to implement it with TCO you can rewrite that foreach to be implemented with the recursive call, it would look something like that:

function deepFlattenTCO(input) {
  const helper = (first, rest, result) => {
    if (!Array.isArray(first)) {
      result.push(first);
      if (rest.length > 0) {
        return helper(rest, [], result);
      } else {
        return result;
      }
    } else {
      const [newFirst, ...newRest] = first.concat(rest);

      return helper(newFirst, newRest, result);
    }
  };

  return helper(input, [], []);
}

console.log(deepFlattenTCO([
  [1], 2, [3], 4, [5, 6, [7]]
]));

You can see that in each return the only operation that is executed is the recursive call, so, you don't save "state" between recursive calls, therefore the compiler will apply the optimization.

felixmosh
  • 32,615
  • 9
  • 69
  • 88
  • I know it was not in my original question but doesnt this solution create many intermediate arrays? also could you explain what is the rest argument for? – George Katsanos Sep 26 '17 at 20:41
  • Yes, The rest parameter is for saving that "state" of the previous calls... – felixmosh Sep 26 '17 at 20:44
  • 1
    The common practice of converting each type of recursive function into TCO recursive function is by passing the "computed state" to the recursion itself. – felixmosh Sep 26 '17 at 20:45
  • A classic example for that: http://blog.moertel.com/posts/2013-05-11-recursive-to-iterative.html#example-factorial – felixmosh Sep 26 '17 at 20:47
  • do you have any automated tools that would measure the memory usage of a function? (dev tools I guess?) PS: thanks! – George Katsanos Sep 26 '17 at 20:55
  • @GeorgeKatsanos Yes, this solution does create many intermediate arrays, but so does your original function. – Bergi Sep 26 '17 at 20:59
  • what does `const [newFirst, ...newRest] = first.concat(rest);` do? – George Katsanos Sep 26 '17 at 21:43
  • @FelixMosheev check http://exploringjs.com/es6/ch_tail-calls.html#_tail-recursive-loops – George Katsanos Sep 26 '17 at 22:03
  • That line merges the first item (that is an array) with what ever there is in the rest, and then take the first item from the result list, and the rest – felixmosh Sep 27 '17 at 04:35
  • Regarding the recursive ForEach, it by itself will get the optimization, but your deepFlat function still will need to save "state" from previous iterations. – felixmosh Sep 27 '17 at 04:50
1

Recursive functions are elegantly expressed, and tail recursion optimization can even prevent them from blowing the stack.

However, any recursive function can be converted to an uglier iterator based solution, which might be beautiful only in its memory consumption and performance, though not to look at.

See: Iterative solution for flattening n-th nested arrays in Javascript

and perhaps this test of different approaches: https://jsperf.com/iterative-array-flatten/2

Haim Zamir
  • 351
  • 2
  • 5