Well, first of all, we can rewrite merge to be a little more elegant using pattern matching
merge [] ys = ys
merge xs [] = xs
merge xs@(x:xs1) ys@(y:ys1)
| x <= y = x : merge xs1 ys
| otherwise = y : merge xs ys1
In general you should avoid using head
and tail
since they are a bit unsafe (they raise an error for the empty list) and use pattern matching whenever possible.
The implementation of msort
is pretty much spot on, except that we can split the list in a more efficient way. That's because length xs
- takes O(N) to complete. The compiler might save you and cache the result of the length
call so that the second call to length
won't traverse the list again. But the take
and drop
will pretty much cause another two traversals thus splitting the list using 3 traversals which may prove to be expensive. We can do better by splitting the list in two lists - the first one containing the elements on the odd positions and the second list with the elements placed on the even positions, like so:
msort [] = []
msort [x] = [x]
msort xs = merge (msort first) (msort second)
where
(first, second) = splitInHalves xs
splitInHalves [] = ([], [])
splitInHalves [x] = ([x], [])
splitInHalves (x:y:xs) =
let (xs1, ys1) = splitInHalves xs
in (x:xs1, y:ys1)
This gets you the same Merge Sort in O(NlogN) time. It feels different because you would probably implement it in place (by modifying the original list) in an imperative language such as C. This version is slightly more costly on the memory, but it does have it's advantages - it is more easy to reason about, so it is more maintainable, and also it is very easy to parallelize without being concerned of anything else except the algorithm itself - which is exactly what a good programming language should provide for the developers that use it.
EDIT 1 :
If the syntax is a bit much, here are some resources:
- Pattern Matching - the bit with the
@
symbol is called an as-pattern. You'll find it in there
let
is a keyword used to declare a variable to be used in the expression that follows it (whereas where
binds a variable in the expression that precedes it). More on Haskell syntax, including guards (the things with | condition = value
) can be found here, in this chapter of Learn You a Haskell
EDIT 2 :
@is7s proposed a far more concise version of splitInHalves
using the foldr
function:
splitInHalves = foldr (\x (l,r) -> (x:r,l)) ([],[])
EDIT 3 :
Here is another answer which provides an alternative implementation of merge sort, which also has the property of being stable:
Lazy Evaluation and Time Complexity
Hope this helps and welcome to the wonderful world of Functional Programming !