I always thought that the prerequisites for a pointfree function were to get the function arguments to the end of the definition
Well, as other answers have shown, this isn't a necessary condition; almost anything can be made point free. So it's not a prerequisite as such, rather just one simple rule for rewriting a function definition. This particular rule even has a name: eta reduction (as in the Greek letter η, spelled out in English as "eta"). But it's not the only such rule.
The conditions for applying eta reduction are also not quite just "get the arguments to the end". That's a loose description of what it "looks like" when you can apply eta reduction, but the actual condition isn't one you can just apply blindly to the source code text without considering the structure of the expression.1
The real rule is to look for something like:
func x = expr x
By that I do not mean "some text (that I'm calling expr
) followed by x
", but rather "some well-formed expression (that I'm calling expr
) applied to x
". The distinction is that I'm talking about the structure of the expression, not the order of characters on the source code representation of that expression. This is crucial to understanding the differences between the examples that are confusing you.
You always need the "top level expression" of the function body to be exactly something directly applied to the last argument of the function. You also need for that argument not to be referenced anywhere else, but it otherwise doesn't matter what the "something" is; it could be a complicated expression with many sub parts.
Now let's consider your examples in that light.
-- This can be made pointfree quite easily:
let lengths x = map length x
Okay, x
is textually at the end. But is it the argument of a top level application? We've got three parts to this expression, not just two, so our rule doesn't necessarily apply immediately. But remembering that function application is left associative (f x y
means f
is applied to x, and the result of that f x
is applied to y
), this is map length
applied to x
. So we can apply eta reduction to eliminate x
, leaving just the expression that was applied to it: map length
.
-- However this cannot:
let lengthz x = length `map` x
Again x
is textually last, but we clearly can't just cut it out textually, since that doesn't even result in well-formed code. An infix operator needs a left and a right argument (a name written in backticks like `map`
turns it into an infix operator).
But remember what infix operator syntax means. length `map` x
is map
applied to length
, and the result of that map length
applied to x
. So eta reduction does apply, and we can eliminate x
. But not just by blinding deleting the source code text "x". What is left should be the expression that was applied to x
, which is map length
again, not the meaningless source code text length `map`
.
We do have another option with infix operators though. You can "leave out" an argument from an infix operator if you convert it to a section by wrapping the thing in parentheses. So we could write map
applied to length
and not apply the result to a second argument by writing (length `map`)
, if you want to keep the result looking similar to what you started with. This also gives another way of eliminating arguments to get to point free code: we can leave out the first argument of an infix operator just as easily as the second. For example if we had:
f x = x / 3
This is (/)
applied to x
, and then the result of that applied to 3
. So the argument of the outermost application is 3
, not x
, and we can't apply eta reduction. But we can apply a very similar process because of Haskell's operator section syntax2, and eliminate x
to leave (/ 3)
. Again, note that the important thing to consider is the structure of the expression x / 3
, not the order of the bits of source code text we use to write it down.
Enough of that digression. Your last example is a little more complicated:
agreeLen :: (Eq a) => [a] -> [a] -> Int
agreeLen x y = length $ takeWhile id $ zipWith (==) x y
y
(and then x
if we could do this twice) is at the end of the source code text. But it is not the argument of the top-level function application in this expression.
One quirk of Haskell's syntax is that if there are operators involved in an expression (unless they're inside a parenthesised sub-expression), one of them is always the top-level application. This is because normal function application is higher-precedence than any possible operator, and it's the lowest precedence thing that ends up being the top-level expression. So just from the presence of the $
operators it's immediately obvious that one of them has to be the "top" function, and there's no way either of them could be applied to just y
on its own. So eta reduction isn't going to apply immediately.
In more detail, this is $
applied to length
, and then to everything on the right. The top-level application is therefore ($) length
being applied to takeWhile id $ zipWith (==) x y
; y
is deep inside the argument expression, we need the argument of the top-level application to just be simply y
.
But remember that f $ x
is just another way of writing f x
, usually used only so we don't have to put parentheses around f
or x
if those are complex expressions. If we put the parentheses back we get closer to having y
at the top level.
agreeLen x y = length (takeWhile id (zipWith (==) x y))
We're still not there. But this is "length
applied to (takeWhile id
applied to (zipWith (==) x
applied to y
)). This pattern of "chaining" things applied to the result of other applications is exactly what function composition with the .
operator is for. We can mentally reframe this as "length
applied after takeWhile id
applied after zipWith (==) x
, and the whole of that chain applied to y
". Written in code as:
agreeLen x y = (length . takeWhile id . zipWith (==) x) y
Now we've got <something>
applied to y
! So we can eta-reduce:
agreeLen x = length . takeWhile id . zipWith (==) x
Note that x
, although now textually last, does not fit the same pattern as y
did above. We don't have parentheses around the whole "function-pipeline" before it's applied to x
, rather x
is part of the last stage in the pipeline. We can shuffle things around to eventually get:
agreeLen x = (((length . takeWhile id) .) . zipWith (==)) x
and then eta reduce to eliminate x
, as shown in other answers. But the steps are more complicated; I don't even know what they are, I just copied that expression from @chepner's answer (who copied it from http://pointfree.io). This shows a key difference between the role of x
and y
in the original agreeLen
, even though they both appeared at the end of source code in the right order.
Which just re-emphasises my point: to understand why you can sometimes eta reduce and sometimes not you need to not think of eta reduction as a simple textual rule; you do actually have to read and understand the structure of the expressions you're trying to rewrite.
1 Actually in pure lambda calculus you can apply this almost as a blind textual rule; Haskell adds more complicated syntax like infix operators with implicit parentheses via precedence/associative, constructs like case, if/then/else, etc.
2 Infix operator syntax doesn't exist in pure lambda calculus, let alone operator sections. So there's no fancy Greek-letter name for this one, but that's only because the theoreticians who named these rules originally weren't studying Haskell.