15

Consider the words Prelude function; it is really easy and one could write it in the following manner:

words' :: String -> [String]
words' [] = []
words' str = before : words' (dropWhile isSpace after) where
    (before, after) = break isSpace str

However, I noticed that its original Prelude code seems much less... natural:

words                   :: String -> [String]
words s                 =  case dropWhile {-partain:Char.-}isSpace s of
                                "" -> []
                                s' -> w : words s''
                                      where (w, s'') =
                                             break {-partain:Char.-}isSpace s'

I assume that there are optimization-related reasons for it. The question is: am I wrong to expect that the compiler should optimize the words' function just as well as its Prelude version? I did use the same functions (break, dropWhile, isSpace).

I was once very surprised that GHC did not perform some of the simplest low-level optimizations:

C vs Haskell Collatz conjecture speed comparison

but aside for the {-partain:Char.-} bits (this hint for the compiler does not seem very helpful in this situation IMO) the words code seems unnecesarily bloated for a high-level language. What is the reason behind it in this case?

Community
  • 1
  • 1
ljedrz
  • 20,316
  • 4
  • 69
  • 97
  • 5
    I don't think the `{-partain:Char.-}` bit is anything more than a commented-out module name, by the way. According to Google, someone with last name Partain worked on GHC some time ago. I'm guessing that was just him signing his comments. – hammar May 08 '13 at 17:35
  • Oh, I thought that it might have had some influence on the compiler. Nice catch with the Partain guy! – ljedrz May 08 '13 at 17:58

1 Answers1

12

This is nearly exactly the same code. The only difference is if we're doing the dropWhile isSpace before every call or only the recursive call. Neither is more complex than the other, but rather the latter (Prelude) version seems more verbose because the pattern matching isn't directly in the function.

You can observe the difference (and why the Prelude version has better behavior) like so:

*Main> words "    "
[]
*Main> words' "     "
[""]

Note that you can quickly verify if your "improved" versions are the same as the originals using QuickCheck.

sclv
  • 38,665
  • 7
  • 99
  • 204
  • Well, I don't think my version is in any way better than the original, but that is the point; to me it seemed that the only difference was verbosity. – ljedrz May 08 '13 at 17:55
  • I had another look in `Prelude` and some other core libraries and I must say that verbosity is in fact quite rare in function definitions (and is probably explained by cases like the one you described). The original question title was an overstatement :). – ljedrz May 08 '13 at 21:25