What is the purpose of toolz.thread_first() and toolz.thread_last()?

Question

See toolz.thread_first() and toolz.thread_last().

It seems to me that they make code strictly worse off. Consider

x = f(x)
x = g(x)
x = h(x)

vs.

x = thread_last(x,
                f,
                g,
                h)

The first example is

more readable and easily understood,
not reliant on an external Python library,
easier to debug, as the multiple statements each have their own line, and
more verbose, but not by a significant margin.

Even if you wanted to pass x through, say, a variably-sized list of functions with x = thread_first(x, *funcs), this could just be accomplished with regular iteration--which is, again, more verbose, but it's not like this situation comes up all that often anyway.
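For reference, the iteration mentioned above can be sketched with the standard library alone. This is a simplified stand-in written for illustration, not toolz's own code: `thread_first_sketch`, `thread_last_sketch`, `inc`, and `double` are hypothetical names. It also shows the one place where the two functions differ, namely forms given as a `(func, arg)` tuple.

```python
from functools import reduce


def thread_first_sketch(val, *forms):
    """Pass val through each form; a tuple (func, *args) calls func(val, *args)."""
    def step(acc, form):
        if callable(form):
            return form(acc)
        func, *args = form
        return func(acc, *args)
    return reduce(step, forms, val)


def thread_last_sketch(val, *forms):
    """Same idea, but a tuple (func, *args) calls func(*args, val)."""
    def step(acc, form):
        if callable(form):
            return form(acc)
        func, *args = form
        return func(*args, acc)
    return reduce(step, forms, val)


# Hypothetical example functions.
def inc(n):
    return n + 1


def double(n):
    return 2 * n


funcs = [inc, double]

# Regular iteration over a variably-sized list of functions.
x = 3
for f in funcs:
    x = f(x)
looped = x

# The threaded form of the same computation.
threaded = thread_first_sketch(3, *funcs)

# With plain callables the two variants agree; with tuples they differ.
first = thread_first_sketch(2, (pow, 3))  # calls pow(2, 3)
last = thread_last_sketch(2, (pow, 3))    # calls pow(3, 2)
```

With only plain callables, the loop and both threaded variants compute the same thing; the tuple form is what the loop does not express directly.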

Why would anyone want to use thread_first() and thread_last()? It basically seems like very bad style to me. In principle, implementing a way to pipe a list of arguments through functions could result in speedups via parallelization--but it doesn't seem to me as though this actually happens with these implementations.

score 4 · Answer 1 · answered Aug 01 '17 at 14:49

While this is mostly opinion-based, there are a number of benefits:

Naming things is hard (or so they say). thread_* or pipe lets you skip intermediate assignments. There is no need to invent dozens of intermediate names or, even worse, to live in a hell of x, y, z variables.

A focus on data flow and data structures enables a clean, declarative style. Large parts of your code can be represented as simple data structures and transformed using standard data structure methods. Arguably, it makes your code easier to understand:

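import operator
import requests
from toolz import thread_first

# url is assumed to be defined elsewhere
thread_first(
  url,
  requests.get,
  requests.models.Response.json,
  operator.itemgetter("result"))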

It also makes code easier to compose and reuse:

request_pipeline = [authorize, fetch, validate]
api_response = [render_json]
html_response = [render_html]

thread_first(request, *request_pipeline + api_response)
thread_first(request, *request_pipeline + html_response)
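To make the reuse pattern concrete, here is a minimal runnable sketch. Everything in it is an assumption made for illustration: the handler bodies are invented, requests are modeled as plain dicts, and `run` is a stdlib stand-in for thread_first that only handles plain callables.

```python
from functools import reduce


def run(value, *funcs):
    """Stand-in for thread_first when every form is a plain callable."""
    return reduce(lambda acc, f: f(acc), funcs, value)


# Hypothetical handlers: each takes a dict and returns a new dict.
def authorize(req):
    return {**req, "user": "alice"}


def fetch(req):
    return {**req, "data": [1, 2, 3]}


def validate(req):
    if not req["data"]:
        raise ValueError("empty payload")
    return req


# Hypothetical renderers: the only step that differs between responses.
def render_json(req):
    return {"result": req["data"]}


def render_html(req):
    items = "".join(f"<li>{n}</li>" for n in req["data"])
    return f"<ul>{items}</ul>"


# The shared steps live in an ordinary list, so they can be
# concatenated with response-specific steps like any other data.
request_pipeline = [authorize, fetch, validate]

api_out = run({"path": "/items"}, *request_pipeline + [render_json])
html_out = run({"path": "/items"}, *request_pipeline + [render_html])
```

Because the pipeline is just a list, inserting, removing, or reordering steps is a list operation rather than a code rewrite.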

It shifts the focus to referential transparency. It naturally encourages small, referentially transparent functions and, as a side effect (pun intended), makes your code much easier to debug.

It plays very well with lazy code (toolz.map, toolz.filter), which makes it great for processing large, possibly infinite, data structures.
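A rough sketch of the lazy case, using only the standard library so it runs without toolz. The names `square`, `is_even`, and `stages` are hypothetical, and the `reduce` call stands in for the threading function. With toolz installed, the same pipeline would likely read thread_last(count(1), (map, square), (filter, is_even), (take, 5), list).

```python
from functools import partial, reduce
from itertools import count, islice


def square(n):
    return n * n


def is_even(n):
    return n % 2 == 0


# Each stage takes an iterator and returns a new (still lazy) iterator,
# except the final list(), which materializes the result.
stages = [
    partial(map, square),       # lazily square each number
    partial(filter, is_even),   # lazily keep the even squares
    lambda it: islice(it, 5),   # stop after five items
    list,
]

# count(1) is an infinite source; nothing is computed until list() pulls.
result = reduce(lambda acc, stage: stage(acc), stages, count(1))
```

Only as many numbers as are needed to produce five even squares are ever generated, which is why the infinite source is not a problem.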

Finally, you have to remember that these functions don't exist alone. They are intended to be used with other parts of toolz (especially function composition and currying) and with built-in modules (like operator), and they play really nicely with third-party tools (like multipledispatch). Only then do they show their full power.

However, many ideas implemented in toolz are far more natural in functional languages (such as Clojure and Elixir) and, as you mentioned, may not feel natural to Python developers.
