2

I've been trying to find a way to condense nested lists into numbers that go back in the original list, but I having some trouble.

I've been looking at the flatten function (that is so widely available) which is given here:

(defun flatten (l)
  (cond
    ((null l) nil)
    ((atom l) (list l))
    (t (loop for a in l appending (flatten a)))))

I understand this example is recursion, but how does it work? It check if the element is null or an atom, but what does it do if the element falls into these conditions?

Zchpyvr
  • 1,119
  • 3
  • 12
  • 26

3 Answers3

4

In my day instead of (loop for a in l appending (g a)) we wrote (mapcan #'g l). Which is equivalent to (apply #'append (mapcar #'g l)), more or less:

(defun flatten (l) 
    (if l 
        (if (atom l) 
            (list l) 
            (mapcan #'flatten l))))

So what does it mean in this case? Imagine you call (flatten (list 1 2 3 4 5)), i.e. the argument list has only atoms in it. Each atom in that list gets enclosed in a list -- becomes a singleton list, like (1) (2) etc. Then they are all appended together, giving us back ... the original list:

(  1   2   3   4   5  )

( (1) (2) (3) (4) (5) )

(  1   2   3   4   5  )

So flattening a list of atoms is an identity operation (in Common LISP, that's #'identity). Now imagine flattening a list which has some atoms in it as well as a list of atoms. Again, each element of the list gets transformed by flatten and then they are all appended together. A list of atoms stays as itself, as we just saw. Atoms get enclosed each in a list. So appending will give us back all the atoms that were on both levels in the nested list, now flattened:

(  11   12  (1 2 3 4)  13  )

( (11) (12) (1 2 3 4) (13) )

(  11   12   1 2 3 4   13  )

And so on and so forth, for more levels of nesting as well.

NILs as elements in lists pose a problem. NIL is an empty list, and empty list contains nothing, so should not contribute anything. But NIL is also an atom. So we make a special case for it, to not enclose it in a singleton list - leave it as it is, so when appended, it'll just disappear.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
  • 1
    APPLY is not a good idea, since it does not work on arbitrarily long lists. – Rainer Joswig May 05 '12 at 21:10
  • @RainerJoswig just used it for illustration. – Will Ness May 05 '12 at 21:11
  • 1
    With the other answers of how the code executes and with Will Ness's explanation of the logic behind this program (which I wouldn't have gotten otherwise), I understand this concept now! – Zchpyvr May 05 '12 at 22:12
  • @RainerJoswig What is meant by `apply` not working on a long list? Should all applications of `apply` be rewritten as recursive functions if the lists are long? and how long constitutes too long to use `apply`? – johnbakers Nov 14 '13 at 11:33
  • 1
    @OpenLearner: `CALL-ARGUMENTS-LIMIT` can be as low as 50. Use `REDUCE` or something else... – Rainer Joswig Nov 14 '13 at 12:20
  • 1
    @RainerJoswig so if I see this: `CL-USER> CALL-ARGUMENTS-LIMIT 536870911` I suppose `apply` is fine, and `reduce` not necessary? – johnbakers Nov 14 '13 at 12:38
  • @OpenLearner (if I may interfere) I guess it's a judgment call. But, if you must write portable code, then probably not. – Will Ness Nov 14 '13 at 13:07
  • @OpenLearner: GCL = 64, ABCL = 50. – Rainer Joswig Nov 14 '13 at 14:26
  • @RainerJoswig this is a side point but I've had my eye on ABCL for a while, is it pretty nice on JVM? a full CL implementation? – johnbakers Nov 14 '13 at 14:28
3
(defun flatten (l)

Above defines a function FLATTEN with one argument called L.

  (cond

IF

    ((null l) nil)

the value of the argument L is the empty list, return the empty list.

    ((atom l) (list l))

or if the argument L is an atom (i.e. not a list), then return a list with the atom as its only item.

    (t 

or else we have a non-empty list

       (loop for a in l

then loop through all the items in the list which is the value of L

        appending (flatten a)

and append the results of flattening each list item.

))))
Rainer Joswig
  • 136,269
  • 10
  • 221
  • 346
2
  1. If the element you are looking at is nil - it is the end of the list, return nil.

  2. If the element you are looking at is not a list return a list containing that element (I'm not actually sure this is a right thing to do, because given an atom, which is not a list, it would return a list with the atom, rather then the atom itself).

  3. Otherwise (if it is a list), loop through all elements of it and append all the flattened sub-trees, (which you flattened by calling flatten), then return them.

This is short, but not the most efficient algorithm since append needs to go all the way to the end of the list in order to cons one part on the tail of another part. Below is slightly more convoluted, but looks like more efficient version of it:

(defun flatten (x &optional y)
  (cond
    ((null x)
     (cond
       ((null y) nil)
       ((listp (car y))
        (flatten (car y) (cdr y)))
       (t (cons (car y) (flatten (cdr y))))))
    ((listp (car x))
     (flatten (car x) (if y (list (cdr x) y) (cdr x))))
    (t (cons (car x) (flatten (cdr x) y)))))

The algorithm behind this function does something as follows:

  1. If we have neither first element, nor the rest - we did everything, so just return nil (end of the list).

  2. If there's no first element - split the second into the first and the rest and repeat.

  3. If there is the first element, cons it into the list, if there is a second element - keep it, otherwise, select the next element and repeat.

EDIT: I changed the #3, because the interpretation was really vague / could be wrong.

  • Even after staring at your efficient flatten function for a long time.... I still don't get it. I'm a greenhorn at lisp, but I'll come back at this code another day and understand your concept. Thanks! – Zchpyvr May 05 '12 at 22:22
  • your code is linear-recursive instead of tree-recursive; but on implementations without [TCO % cons](http://en.wikipedia.org/wiki/Tail_call#Tail_recursion_modulo_cons) (is there any, at all?..) it's still a full-blown recursion. Plus, it conses a lot too, recreating its input structure anew. It is possible to fix both problems in one step. [Here's one example](http://stackoverflow.com/a/10311513/849891) how. :) – Will Ness May 06 '12 at 07:21
  • I think `mapcan` won't perform any extra traversals, and I'd expect `(loop for a in l nconcing (g a))` to not do any either. The maximal recursion depth for both is the nesting depth, but for your version, which replaces looping with recursion, it will be the length of the flattened list. Even without reusing the old structure (which has its place, just should be clearly marked with e.g. `nflatten` name) there's benefits to re-writing the TCO%cons code such as yours, as a *loop* instead of recursion, e.g. w/ `do` construct, building the result list in top-down manner (to avoid `reverse`). – Will Ness May 08 '12 at 05:54
  • the code in your answer right now is tail-recursive modulo cons. It can be converted into a loop by application of the standard technique, constructing the results list in top-down manner while maintaining its ending pointer. `loop` with `nconcing` could do the same, so it would only re-traverse the result of the latest recursive call to `flatten` (a *partial* solution). For a full solution, your code can be translated into a loop, or `flatten` could be re-written to return the `last` cell as well. – Will Ness May 08 '12 at 08:06
  • http://pastebin.com/smPKTSQN I tested w/ CLISP 2.38. mapcan was the fastest, your code ("linear rec") was 2nd at 1.3x time, top-down loop at 1.4x, then nconcing at 1.6x, and appending was last, 2.5x slower. nconcing clearly does something better, running 1.5x faster than appending. Very interesting to see what it'll be on your end. What are you testing it on? Please test *this code* as is, so we can compare. btw "linear rec" and "appending" run at worse cmpxties than the other three for growing data size. – Will Ness May 08 '12 at 14:55