4

Say you have a list of numbers, [1,2,3,5,6,7,8,9,11,12,15,16,17]

and you want a function that takes that as an input and returns something like

[[1,3],[5,9],[11,12],[15,17]] or alternatively maybe [(1,3), (5,9), (11,12), (15,17)]

how would this be done? all of the solutions i've found online are very very long and quite convoluted, when this seems like such an easy problem for a functional language like haskell

ar1a
  • 374
  • 3
  • 10
  • 1
    Can you provide one such very very long and quite convoluted solution? Perhaps then we can focus on shortening it or look at where the standard library or another solution strategy may help in shortening the solution. – sshine Dec 01 '19 at 05:19
  • https://www.rosettacode.org/wiki/Range_extraction#Haskell – ar1a Dec 01 '19 at 05:20
  • Is your actual goal to solve that Rosetta Code problem? – dfeuer Dec 02 '19 at 04:13

3 Answers3

4

So we have a list of numbers,

xs = [1,2,3,5,6,7,8,9,11,12,14,16,17]     -- 14 sic!

We turn it into a list of segments,

ys = [[x,x+1] | x <- xs]
  -- [[1,2], [2,3], [3,4], [5,6], ..., [11,12], [12,13], [14,15], [16,17], [17,18] ]

we join the touching segments,

zs = foldr g [] ys
  -- [[1,4], [5,10], [11,13], [14,15], [16,18]]
  where
  g [a,b] []                  =  [[a,b]]
  g [a,b] r@([c,d]:t) | b==c  =  [a,d]:t
                 | otherwise  =  [a,b]:r

and we subtract 1 from each segment's ending value,

ws = [[a,b-1] | [a,b] <- zs]
  -- [[1,3], [5,9], [11,12], [14,14], [16,17]]

All in all we get

ranges :: (Num t, Eq t) => [t] -> [[t]]
ranges  =  map (\[a,b] -> [a,b-1]) . foldr g [] . map (\x -> [x,x+1]) 
  where
  g [a,b] []                  =  [[a,b]]
  g [a,b] r@([c,d]:t) | b==c  =  [a,d]:t
                 | otherwise  =  [a,b]:r

Simple and clear.

edit: or, to be properly lazy,

  where
  g [a,b] r = [a,x]:y
       where
       (x,y) = case r of ([c,d]:t) | b==c  -> (d,t)   -- delay forcing
                         _                 -> (b,r)

update: as dfeuer notes, (a,a) type is better than [a,a]. Wherever [P,Q] appears in this code, replace it with (P,Q). This will improve the code, with zero cost to readability.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
  • `-- 14 sic!` :). Yet `[14..14]` gives just `[14]` if you attempt to reconstruct the original list. So shouldn't be a problem. ie `foldMap (\[x,y] -> [x..y])` – Redu Dec 01 '19 at 17:25
  • yes, the representation of singleton ranges wasn't specified in the Q. as long as all the needed information is present, it doesn't matter. BTW as written, the segments-joining `foldr` isn't lazy (i.e. top-down, or left-to-right), as it should be. (`head.head.ranges $ [1..]` shouldn't hang). food for thought! (or just some `~` patterns thrown around for good measure). – Will Ness Dec 01 '19 at 22:59
  • I think it's cleaner to go `a -> (a,a)` to start (map the diagonal function). Then you can merge up on endpoints and not monkey around with the surprise addition and subtraction. – dfeuer Dec 02 '19 at 04:03
  • Also, the representation using pairs, or `Either a (a,a)`, is much better than one using lists. – dfeuer Dec 02 '19 at 04:12
  • of course it's better, as it's much more tight. I meant "doesn't matter" as in "doesn't lose anything". "monkeying around" that's the playful fun of FP! the austerity of tight types and tight codes in Haskell is another layer on top of that. :) and I definitely had in mind another possible processing step converting the lists into whatever, as *the OP would want*, all the interim steps hidden well inside. (what do you mean by map the diagonal? I actually envisioned the short 1-intervals arranged as the diagonal, when I thought it up) – Will Ness Dec 02 '19 at 05:19
  • @dfeuer (contd.) re addition, I thought about zipping with tail but that didn't immediately work with the singleton intervals, and the addition did. also, my goals were correctness *and* simplicity and immediate clarity of the solution. that's why the first one isn't even properly lazy - that one already is more obscure. – Will Ness Dec 02 '19 at 05:20
  • To be really tight and also properly lazy is a little trickier, and I believe it requires a departure from Haskell lists as a result type. – dfeuer Dec 02 '19 at 05:31
  • @dfeuer isn't my 2nd solution as lazy as yours? I thought it is. also, I don't see any difference between `(a,a)` and `[a,a]`. – Will Ness Dec 02 '19 at 05:40
  • All the solutions look quite good, but i think yours wins out! thanks! – ar1a Dec 02 '19 at 06:09
  • @dfeuer ah, wait, of course there's a difference. (-‸ლ) But I have a feeling it makes no difference here in this specific algorithm. any access to the second element forces the input list till the next segment's start is discovered, and that automatically closes the previous segment list anyway. the drawback might be that the `[a,a]` allows a nonsensical call like `length . head . ranges` which is impossible with the `(a,a)`, as it should be. perhaps that's what you meant. – Will Ness Dec 02 '19 at 07:05
  • (that (-‸ლ) was Google's suggestion for the ASCII variant of , if you were wondering). – Will Ness Dec 02 '19 at 12:02
  • I have just discovered by chance that I had already done [a singleton version](https://stackoverflow.com/a/53191704/4543207) under a very similar question. I must be going amnesic :) – Redu Dec 02 '19 at 12:46
  • @Redu I also answered a few questions like that in the past. the first one I remember vaguely also had [an answer](https://stackoverflow.com/a/14403413/849891) by [Daniel Fischer](https://stackoverflow.com/users/1011995/daniel-fischer), with a lazy foldr-based solution. of course here the ranges are collapsed to two endpoints, so it's little bit different, though yes, not that much. – Will Ness Dec 02 '19 at 17:03
1

So one might do it like the idea from @Will Ness on the stateful folding or mine under the same answer. All explanations are to be found there. Besides, if you get curious and want to read more about it then have a look at Haskell Continuation Passing Style page. I am currently trying to gerealize this in such a way that we can have a variant of foldr1 in a stateful manner. A foldS :: Foldable t => (a -> a -> b) -> t a -> b. However this is still not general stateful folding. It's just tailored to this question.

ranges :: (Ord a, Num a) => [a] -> [[a]]
ranges xs = foldr go return xs $ []
            where
            go :: (Ord a, Num a) => a -> ([a] -> [[a]]) -> ([a] -> [[a]])
            go c f = \ps -> let rrs@(r:rs) = f [c] 
                            in case ps of 
                               []  -> [c]:r:rs
                               [p] -> if p + 1 == c then rrs else [p]:(c:r):rs

*Main> ranges [1,2,3,5,6,7,8,9,11,12,15,16,17]
[[1,3],[5,9],[11,12],[15,17]]

I haven't had time to test any edge cases. All advices are welcome.

Redu
  • 25,060
  • 6
  • 56
  • 76
1

I would definitely prefer the alternative representation to the first one you give.

ranges :: (Num a, Eq a) => [a] -> [(a,a)]
ranges [] = []
ranges (a : as) = ranges1 a as

-- | A version of 'ranges' for non-empty lists, where
-- the first element is supplied separately.
ranges1 :: (Num a, Eq a) => a -> [a] -> [(a,a)]
ranges1 a as = (a, b) : bs
  where
    -- Calculate the right endpoint and the rest of the
    -- result lazily, when needed.
    (b, bs) = finish a as

-- | This takes the left end of the current interval
-- and the rest of the list and produces the right endpoint of
-- that interval and the rest of the result.
finish :: (Num a, Eq a) => a -> [a] -> (a, [(a, a)])
finish l [] = (l, [])
finish l (x : xs)
  | x == l + 1 = finish x xs
  | otherwise = (l, ranges1 x xs)

To solve the Rosetta Code problem linked in the comment above, this isn't really quite an optimal representation. I'll try to explain how to match the representation more precisely later.

dfeuer
  • 48,079
  • 5
  • 63
  • 167