4

I tried to implement words function from Data.List, but my implementation doesn't work exactly as I wish.

For example, if the function's input is "tere vana kere" then the output is ["vana", "kere"] and it misses the first word. But when I add space in front of my input " tere vana kere" then the output is correct ["tere", "vana", "kere"]

Could someone point out the problem. Thank You

words' :: String -> [String]
words' xs = snd $ foldr (\x acc -> if isSpace x then 
                                    if null (fst acc) then
                                        acc
                                    else
                                        ([], (fst acc): (snd acc)) 
                               else 
                                     (x:fst acc, snd acc)   
                               ) ([],[]) xs
Rene
  • 289
  • 1
  • 4
  • 8

3 Answers3

3

OK, so let's try this:

step x acc =
  if isSpace x
    then
      if null (fst acc)
        then acc
        else ([], (fst acc) : (snd acc))
    else (x : fst acc, snd acc)

words' xs = snd $ foldr step ([], []) xs

Now let's walk this through, one step at a time: Suppose we want words' "ABC DEF GHI". We can do it like this:

Prelude> step 'I' ([], [])
("I", [])
Prelude> step 'H' it
("HI", [])
Prelude> step 'G' it
("GHI", [])
Prelude> step ' ' it
("", ["GHI"])
Prelude> step 'F' it
("F", ["GHI"])
Prelude> step 'E' it
("EF", ["GHI"])
Prelude> step 'D' it
("DEF", ["GHI"])
Prelude> step ' ' it
("", ["DEF","GHI"])
Prelude> step 'C' it
("C", ["DEF","GHI"])
Prelude> step 'B' it
("BC", ["DEF","GHI"])
Prelude> step 'A' it
("ABC", ["DEF","GHI"])
Prelude> snd it
["DEF","GHI"]

Do you see the problem here?

The trouble is, you only "flush" the current word into the word list when you see a space character. In particular, you don't flush when you see the end of the input. You can fix that by replacing snd:

words' xs = (\ (w, ws) -> w:ws) $ foldr step ([], []) xs

As an aside, congrats on making the code correctly handle multiple consecutive spaces. :-)

EDIT: To preserve that nice property:

words' xs = (\ (w, ws) -> if null w then ws else w:ws) $ ...
MathematicalOrchid
  • 61,854
  • 19
  • 123
  • 220
  • You have a typo: `css`. And this doesn't seem to work well for input with leading spaces (" a b c"). It will add empty element at the beginning. – Piotr Praszmo Dec 19 '14 at 19:13
  • is it productive enough on infinite lists? (haven't read through the full answer yet, sorry) `if null (fst acc)` is suspicious... (`words $ cycle "s"` should print `["sssssssss ...`). cf. http://stackoverflow.com/questions/14403293/need-to-partition-a-list-into-lists-based-on-breaks-in-ascending-order-of-elemen/14451312#comment27363635_14451312 and the answers there: the trick is to create the shape first, and only then to fill its holes, afterwards. – Will Ness Dec 19 '14 at 20:14
  • @WillNess I suspect you can't fix that with a _right_ fold. I haven't tested it, but I would expect the same strictness properties as the OP's original code. – MathematicalOrchid Dec 19 '14 at 20:25
  • `words xs = (\ws@(w:r)->if null w then r else ws) $ foldr g [[]] xs where g c r@ ~(a:b) | isSpace c && null a = r | isSpace c = []:r | otherwise = (c:a):b`. this works correctly, so I guess your code must be as well - they are very similar. so there was no problem, my mistake. – Will Ness Dec 20 '14 at 09:24
0

Wrt the above answer: Instead of flushing the current word at the end of the string using separate checks, why not just append a space at the beginning of the string.

step x acc =
  if isSpace x
    then
      if null (fst acc)
        then acc
        else ([], (fst acc) : (snd acc))
    else (x : fst acc, snd acc)

words' xs = snd $ foldr step ([], []) $ ' ':xs
sushobhana
  • 1
  • 1
  • 1
0

a humble alternative from a beginner haskeller

import           Data.Char     (isSpace)

firstWord :: String -> String
firstWord s =
  if isSpace $ head s
    then getFirstWord $ tail s
    else getFirstWord s
  where
    getFirstWord = takeWhile (/= ' ')

words' :: String -> [String]
words' s
  | null s = []
  | otherwise = first : words' rest
  where
    first = firstWord s
    rest = drop (length first + 1) s
OJOMB
  • 1