Frnakly, it looks like black magic to me. Granted, my knowledge of Haskell is fairly limited, but I can understand basic principles. Just because it makes matters that much easier I try to work through a test input, but in this case it still doesn't make sense.
Background: About a year ago I posted this question (Are recursive calls in my "permutations with repetition" code accumulated to clog the RAM?) about a piece of permutation generation code I was trying to get working. I received quite a lot of help from this wonderful community and went on with my project. However, there was one last answer I didn't think much of at the time, from user "peter pun" that I revisited recently.
Part of the answer, the important one, that generates permutations is this
permutationsNub :: Eq a => [a] -> [[a]]
permutationsNub = foldr (concatMap . insert) [[]]
where insert y = foldr combine [[y]] . (zip <*> tail . tails) -- tails through importing Data.List
where combine (x, xs) xss = (y : x : xs) :
if y == x then [] else map (x :) xss
Allow me to reform the function in a way that makes more sense for me, so that I can see every argument and refer to it, because of partial application in the original and variable names that confused me. Please tell me if the interpretation is correct.
permutationsNub :: Eq a => [a] -> [[[a]]]
permutationsNub digits = scanr (concatMap . insert) [[]] digits
where insert digit perms = foldr combine [[digit]] $ (zip <*> tail . tails) perms -- zip will produce tuples of the form (a, [a])
where combine (permX, tailOfPerms) digitAsDoubleList = (digit : permX : tailOfPerms) :
if digit == permX then [] else map (permX :) digitAsDoubleList
I replaced the main foldr function with scanr (the output type also) to see intermediate steps, so if I run permutationNub [2,7,7,8]
the result is [[[2,7,7,8],[7,2,7,8],[7,7,2,8],[7,7,8,2],[2,7,8,7],[7,2,8,7],[7,8,2,7],[7,8,7,2],[2,8,7,7],[8,2,7,7],[8,7,2,7],[8,7,7,2]],[[7,7,8],[7,8,7],[8,7,7]],[[7,8],[8,7]],[[8]],[[]]]
which gives some insight.
So, I have to ask:
First of all, why is applicative notation (
<*>
) used next tozip
? Is it dictated by the use of point free notation? How do applicatives creep into this?My renaming of variables so that they make sense to me seems fine, because the function works, aka produces the same output as the original. However, some things are strange, e.g how can
digit
ever be equal topermX
as per the comparison check in thecombine
function? The former is an Int and the latter is a [Int], otherwise cons-ing likedigit:permX:...
in the same function wouldn't work, right? For example, if we callpermutationNub [2,7,7,8]
, on the second pass of the main foldr (or scanr) function, digit will be7
and permX will[8]
, so how can these two ever be equal?It gets stranger from here. Well, both of them can then be cons-ed to tailOfPerms, which is an [[Int]], but then this whole thing, enclosed in brackets is cons-ed to an ... [[Int]]? Again?
Actually, map (permX :) digitAsDoubleList
doesn't seem to work (yet, somehow it does...) because we're trying to map [Int] to a [[Int]], as is the result, and then cons to that the previous [[Int]], which I don't see how it could fit.
For example, on the second pass of the permutationNub, with 7 as the next digit, picked by the main foldr/scanr function, first pass of
insert` will be (7:[8]:[]):[[8]:[[7]]], which I can't make sense of.
- Note: As a higher level explanation the author of the code says The basic idea is that, given the unique permutations of a list's tail, we can compute the unique permutations of the whole list by inserting its head into all of the tail's permutations, in every position that doesn't follow an occurrence of the head (to avoid creating duplicates). So, as an example, let's say that current permutations of the tail are [[7,8],[8,7]] (see pass 2) and the next digit is 7. So, the algorithm will insert 7 at the beginning of the first permutation (producing [7,7,8], then at the end (producing [7,8,7] and then only at the beginning of the second permutation (producing [8,7,7])? How can it tell that it doesn't need to elsewhere? What does the author mean by in every position that doesn't follow an occurrence of the head?
If anyone can shed some light into this piece of coding wizardry, please do so!