interleaving two strings, preserving order: functional style

Question

In this question, the author brings up an interesting programming question: given two string, find possible 'interleaved' permutations of those that preserves order of original strings.

I generalized the problem to n strings instead of 2 in OP's case, and came up with:

-- charCandidate is a function that finds possible character from given strings.
-- input : list of strings
-- output : a list of tuple, whose first value holds a character 
-- and second value holds the rest of strings with that character removed
-- i.e ["ab", "cd"] -> [('a', ["b", "cd"])] ..

charCandidate xs = charCandidate' xs []
charCandidate' :: [String] -> [String] -> [(Char, [String])]
charCandidate' [] _ = []     
charCandidate' ([]:xs) prev = 
    charCandidate' xs prev
charCandidate' (x@(c:rest):xs) prev =
    (c, prev ++ [rest] ++ xs) : charCandidate' xs (x:prev)

interleavings :: [String] -> [String]
interleavings xs = interleavings' xs []    

-- interleavings is a function that repeatedly applies 'charCandidate' function, to consume
-- the tuple and build permutations.
-- stops looping if there is no more tuple from charCandidate.

interleavings' :: [String] -> String -> [String]
interleavings' xs prev = 
    let candidates = charCandidate xs
        in case candidates of
            [] -> [prev]
            _  -> concat . map (\(char, ys) -> interleavings' ys (prev ++ [char])) $ candidates

-- test case
input :: [String]
input = ["ab", "cd"]    
-- interleavings input == ["abcd","acbd","acdb","cabd","cadb","cdab"]

it works, however I'm quite concerned with the code:

it is ugly. no point-free!
explicit recursion and additional function argument prev to preserve states
using tuples as intermediate form

How can I rewrite the above program to be more "haskellic", concise, readable and more conforming to "functional programming"?

I wouldn't recommend pursuing point-free style too far. It may be educational and even fun, but if taken beyond a certain point, point-free programming can make things less readable, not more... — comingstorm, Mar 31 '16 at 20:29
Also, please avoid the likes of `(:) (...) $ ...` in favor of `(...):(...)`. In the above case, you don't even need the second set of parens... — comingstorm, Mar 31 '16 at 20:48
I haven't tested, but I suspect you meant to write `charCandidate'` everywhere you currently have `getCandidate`, and meant to write `charCandidate'` everywhere you currently have `charCandidate` in the five lines following the type declaration for `charCandidate'`. — Daniel Wagner, Mar 31 '16 at 20:59
@DanielWagner sorry, I've changed some names while copypasting from my editor and it seems I forgot to make my edit match with name-changes. — thkang, Mar 31 '16 at 21:06

Daniel Wagner · Accepted Answer · 2016-03-31T20:53:00.240

I think I would write it this way. The main idea is to treat creating an interleaving as a nondeterministic process which chooses one of the input strings to start the interleaving and recurses.

Before we start, it will help to have a utility function that I have used countless times. It gives a convenient way to choose an element from a list and know which element it was. This is a bit like your charCandidate', except that it operates on a single list at a time (and is consequently more widely applicable).

zippers :: [a] -> [([a], a, [a])]
zippers = go [] where
    go xs [] = []
    go xs (y:ys) = (xs, y, ys) : go (y:xs) ys

With that in hand, it is easy to make some non-deterministic choices using the list monad. Notionally, our interleavings function should probably have a type like [NonEmpty a] -> [[a]] which promises that each incoming string has at least one character in it, but the syntactic overhead of NonEmpty is too annoying for a simple exercise like this, so we'll just give wrong answers when this precondition is violated. You could also consider making this a helper function and filtering out empty lists from your top-level function before running this.

interleavings :: [[a]] -> [[a]]
interleavings [] = [[]]
interleavings xss = do
    (xssL, h:xs, xssR) <- zippers xss
    t <- interleavings ([xs | not (null xs)] ++ xssL ++ xssR)
    return (h:t)

You can see it go in ghci:

> interleavings ["abc", "123"]
["abc123","ab123c","ab12c3","ab1c23","a123bc","a12bc3","a12b3c","a1bc23","a1b23c","a1b2c3","123abc","12abc3","12ab3c","12a3bc","1abc23","1ab23c","1ab2c3","1a23bc","1a2bc3","1a2b3c"]
> interleavings ["a", "b", "c"]
["abc","acb","bac","bca","cba","cab"]
> permutations "abc" -- just for fun, to compare
["abc","bac","cba","bca","cab","acb"]

@thkang Ah, right, so: `[e1 | e2]` is a list that contains element `e1` if `e2` evaluates to `True` and an empty list if `e2` evaluates to `False`. So it's shorthand for `if hay xs then [xs] else []`. Cute, no? Anyway the purpose is to maintain the invariant that we pass only non-empty lists to `interleavings`. — Daniel Wagner, Mar 31 '16 at 20:44
thanks, that was the only part I couldn't understand - I've read about zippers but it's nice to see a real application here (I'm quite new to haskell). — thkang, Mar 31 '16 at 20:50
For posterity: thkang's earlier question mentions `[xs | hay xs]`. This is because originally I had defined `hay = not . null`; but I decided it wasn't worth splitting out into its own function and inlined it, so the new spelling is `[xs | not (null xs)]`. — Daniel Wagner, Mar 31 '16 at 20:54
I've come up with an implementation that seems to be much faster when dealing with long lists. Could you possibly take a look and see if you have any suggestions (especially for making it prettier)? — dfeuer, Apr 03 '16 at 01:49

dfeuer · Answer 2 · 2016-04-03T03:50:18.983

This is fastest implementation I've come up with so far. It interleaves a list of lists pairwise.

interleavings :: [[a]] -> [[a]]
interleavings = foldr (concatMap . interleave2) [[]]

This horribly ugly mess is the best way I could find to interleave two lists. It's intended to be asymptotically optimal (which I believe it is); it's not very pretty. The constant factors could be improved by using a special-purpose queue (such as the one used in Data.List to implement inits) rather than sequences, but I don't feel like including that much boilerplate.

{-# LANGUAGE BangPatterns #-}
import Data.Monoid
import Data.Foldable (toList)
import Data.Sequence (Seq, (|>))

interleave2 :: [a] -> [a] -> [[a]]
interleave2 xs ys = interleave2' mempty xs ys []

interleave2' :: Seq a -> [a] -> [a] -> [[a]] -> [[a]]
interleave2' !prefix xs ys rest =
  (toList prefix ++ xs ++ ys)
     : interleave2'' prefix xs ys rest

interleave2'' :: Seq a -> [a] -> [a] -> [[a]] -> [[a]]
interleave2'' !prefix [] _ = id
interleave2'' !prefix _ [] = id
interleave2'' !prefix xs@(x : xs') ys@(y : ys') =
  interleave2' (prefix |> y) xs ys' .
      interleave2'' (prefix |> x) xs' ys

score 1 · Answer 3 · answered Apr 03 '16 at 08:35

1

Using foldr over interleave2

interleave :: [[a]] -> [[a]]
interleave = foldr ((concat .) . map . iL2) [[]]  where 
   iL2 [] ys = [ys]
   iL2 xs [] = [xs]
   iL2 (x:xs) (y:ys) = map (x:) (iL2 xs (y:ys)) ++ map (y:) (iL2 (x:xs) ys)

answered Apr 03 '16 at 08:35

Luka Rahne

10,336
3
34
56

dvaergiller · Answer 4 · 2016-04-01T13:17:14.030

0

Another approach would be to use the list monad:

interleavings xs ys = interl xs ys ++ interl ys xs where
  interl [] ys = [ys]
  interl xs [] = [xs]
  interl xs ys = do
    i <- [1..(length xs)]
    let (h, t)  = splitAt i xs
    map (h ++) (interl ys t)

So the recursive part will alternate between the two lists, taking all from 1 to N elements from each list in turns and then produce all possible combinations of that. Fun use of the list monad.

Edit: Fixed bug causing duplicates

Edit: Answer to dfeuer. It turned out tricky to do code in the comment field. An example of solutions that do not use length could look something like:

interleavings xs ys = interl xs ys ++ interl ys xs where 
  interl [] ys = [ys]
  interl xs [] = [xs]
  interl xs ys = splits xs >>= \(h, t) -> map (h ++) (interl ys t)

splits [] = []
splits (x:xs) = ([x], xs) : map ((h, t) -> (x:h, t)) (splits xs)

The splits function feels a bit awkward. It could be replaced by use of takeWhile or break in combination with splitAt, but that solution ended up a bit awkward as well. Do you have any suggestions?

(I got rid of the do notation just to make it slightly shorter)

edited Apr 01 '16 at 13:17

answered Apr 01 '16 at 09:42

dvaergiller

795
3
11

Can you see a way to avoid `length`? It's not really needed here. – dfeuer Apr 01 '16 at 12:12
You could lazily produce all possible ways to split the list and then draw elements from that. I tried one version but it did not make it more readable, and I am not sure it would be faster either. – dvaergiller Apr 01 '16 at 12:57
Just realised that this solution is very similar to that of Daniel Wagner – dvaergiller Apr 01 '16 at 13:21
Another option (better, I think) is to recognize that interleaving can be done pairwise. – dfeuer Apr 01 '16 at 19:27
Hm. Please elaborate – dvaergiller Apr 02 '16 at 20:23

Rotsor · Answer 5 · 2016-04-03T21:08:24.830

0

Combining the best ideas from the existing answers and adding some of my own:

import Control.Monad

interleave [] ys = return ys
interleave xs [] = return xs
interleave (x : xs) (y : ys) =
  fmap (x :) (interleave xs (y : ys)) `mplus` fmap (y :) (interleave (x : xs) ys)

interleavings :: MonadPlus m => [[a]] -> m [a]
interleavings = foldM interleave []

This is not the fastest possible you can get, but it should be good in terms of general and simple.

edited Apr 03 '16 at 21:08

answered Apr 03 '16 at 05:19

Rotsor

13,655
6
43
57

It's pretty, and it was my original answer, but it has some performance problems. Left-nested `++` and lots of repeated `map` applications will make this slow. – dfeuer Apr 03 '16 at 05:48
Indeed. Such inefficiencies are dwarfed by the output size though so they are unlikely to matter much. – Rotsor Apr 03 '16 at 06:01
Edited the answer slightly so that `interleavings` works for any `MonadPlus` now. This means you can make it faster just by using a different monad! – Rotsor Apr 03 '16 at 06:21
Better, but the nested maps are still an issue, and you have to get to the end of the first list before you produce anything at all. I like your `mplus` idea, but I don't think it's sufficient. – dfeuer Apr 03 '16 at 11:46

interleaving two strings, preserving order: functional style

5 Answers5

Linked