1

I'm writing a function that remove white space in a json string. I need to know if the current char I'm processing is surrounded by ", or if it is after a escape char \. So I need two more param for this function.

Here is the current implementation. But I don't think it is lazy. How could I make it lazy with "filter" or "map" on the json string?

compressJson :: String -> String
compressJson json = compress json False False ""
    -- compress params: json, inStr, aferEscape, acc
    where compress []          _     _     acc = acc
          compress ('\"' : xs) inStr False acc = compress xs (not inStr) False (acc ++ "\"")
          compress ('\\' : xs) inStr False acc = compress xs inStr       True  acc
          compress (x    : xs) inStr True  acc = compress xs inStr       False (acc ++ ['\\', x])
          compress (x    : xs) True  False acc = compress xs True        False (acc ++ [x])
          compress (x    : xs) False _     acc = compress xs False       False (acc ++ parse x)

          parse c = if c `elem` " \t\n"
              then []
              else [c]
Bin Wang
  • 2,697
  • 2
  • 24
  • 34
  • @user2407038 if you do the changes as you say but leave it tail recursive, it'll still force the whole input through; and then using `reverse` will force the output through - so it's the opposite of what the OP asks for. -- "multiple elements" are usually handled through `concatMap :: (a -> [b]) -> [a] -> [b]`. – Will Ness Jun 01 '14 at 06:53

2 Answers2

9

This is actually trivial to make lazy - don't make it tail-recursive.

More or less like this (I didn't test it)

compressJson :: String -> String
compressJson json = compress json False False
    -- compress params: json, inStr, aferEscape
    where compress []          _     _     = ""
          compress ('\"' : xs) inStr False = '\"' : compress xs (not inStr) False
          compress ('\\' : xs) inStr False = compress xs inStr True
          compress (x    : xs) inStr True  = '\\' : x : compress xs inStr False
          compress (x    : xs) True  False = x : compress xs True False
          compress (x    : xs) False _     = parse x ++ compress xs False False

          parse c = if c `elem` " \t\n"
              then []
              else [c]

Tail recursion and laziness are at direct odds with each other. Tail recursion implies that a function is a call to itself with modified parameters. Laziness implies that it returns a constructor promptly (ie, not after some indeterminate number of recursive calls). So if you want a function to be lazy, write it to return partial results immediately.

Carl
  • 26,500
  • 4
  • 65
  • 86
2

The word "lazy" in the question means we want a function that is "least forcing on the input and most productive in the output". Same word is used in "Haskell's lazy evaluation", so it might be confusing.

Productive functions can be encoded with guarded recursion as in Carl's great answer. It follows a foldr pattern w.r.t. the input string, but foldl w.r.t. the other 2 parameters. In other words it constructs its output from right to left, but to do so it needs to pass the controlling params from left to the right. Hence the OP question of how to encode this with a HOF ("make it lazy with filter or map...").

Guarded recursion is closely related to corecursion. Corecursion is, basically, unfolding. Recursion is "going back", but corecursion is "going forward". The gradual consumption of input, bit by bit, can be seen as "going forward" along it too. So we shall use unfoldr, and consume its output with concat (to accommodate the need to skip, or sometimes produce more than one elements in the output).

Thus we get the efficient code from your clear and easy-to-read formulation by a purely mechanical manipulation, because the accumulating argument technique you've used is an expression of iteration, just as corecursion is too:

import Data.List (unfoldr)

compressJson :: String -> String
compressJson json = -- g json False False ""
                    concat $ unfoldr h (json,False,False)
  where 
    {- g params: json, inStr, aferEscape, acc
    g []          _     _     acc = acc
    g ('\"' : xs) inStr False acc = g xs (not inStr) False (acc ++ "\"")
    g ('\\' : xs) inStr False acc = g xs inStr       True   acc
    g (x    : xs) inStr True  acc = g xs inStr       False (acc ++ ['\\', x])
    g (x    : xs) True  False acc = g xs True        False (acc ++ [x])
    g (x    : xs) False _     acc = g xs False       False (acc ++ parse x)
    -}
    parse c = [c | not (c `elem` " \t\n")]

    h ([]       , _    , _    ) = Nothing
    h ('\"' : xs, inStr, False) = Just ("\"",     (xs, not inStr, False))
    h ('\\' : xs, inStr, False) = Just ([]  ,     (xs,     inStr,  True))
    h (x    : xs, inStr, True ) = Just (['\\', x],(xs,     inStr, False))
    h (x    : xs, True , False) = Just ([x],      (xs,      True, False))
    h (x    : xs, False, _    ) = Just (parse x,  (xs,     False, False))

see also:

duplode
  • 33,731
  • 7
  • 79
  • 150
Will Ness
  • 70,110
  • 9
  • 98
  • 181