Lisp function: union

Question

I have a lisp homework I am having a hard time with it.

I have to write a function that perform a union operation. The function takes 2 inputs, either in the form of either atom or list and unions every element, preserving the order and stripping off all levels of parenthesis.

The output for the function:

(my-union 'a 'b)                         ;; (a b)
(my-union 'a '(b))                       ;; (a b)
(my-union '(a b) '(b c))                 ;; (a b c)
(my-union '(((a))) '(b(c((d e))a)))      ;; (a b c d e)

I am fairly new to lisp. Here is what I have written so far and it works only for the third example:

(defun new-union (a b)
 (if (not b)
      a
      (if (member (car b) a)
        (new-union a (cdr b))
        (new-union (append a (list (car b))) (cdr b)))))

Any help would be appreciated!

score 5 · Accepted Answer · answered Sep 30 '12 at 13:10

5

Since this is your first homework, and you are new to Lisp, here is a very simple top-down approach, not worrying about performance, and making good use of the tools CL offers:

In Common Lisp, there is already a function which removes duplicates: remove-duplicates. Using it with the :from-end keyword-argument will "preserve order". Now, imagine you had a function flatten, which flattens arbitrarily nested lists. Then the solution to your question would be:

(defun new-union (list1 list2)
  (remove-duplicates (flatten (list list1 list2)) :from-end t))

This is how I would approach the problem when no further restrictions are given, and there is no real reason to worry much about performance. Use as much of the present toolbox as possible and do not reinvent wheels unless necessary.

If you approach the problem like this, it boils down to writing the flatten function, which I will leave as an exercise for you. It is not too hard, one easy option is to write a recursive function, approaching the problem like this:

If the first element of the list to be flattened is itself a list, append the flattened first element to the flattened rest. If the first element is not a list, just prepend it to the flattened rest of the list. If the input is not a list at all, just return it.

That should be a nice exercise for you, and can be done in just a few lines of code.

(If you want to be very correct, use a helper function to do the work and check in the wrapping function whether the argument really is a list. Otherwise, flatten will work on atoms, too, which may or may not be a problem for you.)

Now, assuming you have written flatten:

> (defun new-union (list1 list2)
    (remove-duplicates (flatten (list list1 list2)) :from-end t))
NEW-UNION
> (new-union 'a 'b)
(A B)
> (new-union 'a '(b))
(A B)
> (new-union '(a b) '(b c))
(A B C)
> (new-union '(((a))) '(b (c ((d e)) a)))
(A B C D E)

answered Sep 30 '12 at 13:10

danlei

14,121
5
58
82

1

I think one _has_ to reinvent wheels at first, to get the feel for a language; to _recognize_ the standard functions instead of trying hard to learn their names and interfaces by heart without real understanding. Using the standard functions comes later. It's hard for me to form a correct jigsaw puzzle if all the pieces are like black boxes to me, doing their work by black magic. :) – Will Ness Sep 30 '12 at 14:53
1

Will, reinventing wheels makes for good exercise and has its place. However, starting with Lisp or similar languages, one important lesson is higher-level thinking, like learning that there are reusable, composable, general functions. The point of my answer was not only that there is `remove-duplicates` but also that the solution to the problem can be expressed as a combination of simple building blocks. Of course, reimplementing `remove-duplicates`, or going for a more performant solution is an option, but I don't think reimplementation should always or even mostly precede use. – danlei Sep 30 '12 at 16:52
Thanks a lot! Could you recommend a good book to understand what's really going on with lisp? – user1561949 Sep 30 '12 at 18:17
1

You're welcome. There are many good books. To absolute beginners, I usually recommend [Touretzky](http://www.cs.cmu.edu/~dst/LispBook/book.pdf). However, depending on your general programming experience, it might be a little too slow-paced. Also see an [older answer](http://stackoverflow.com/a/2247639/271324), where I list a few (free) alternatives. [Winston & Horn](http://www.amazon.com/Lisp-3rd-Edition-Patrick-Winston/dp/0201083191/) would also bee an option, but it's not online. Later/ad lib: [PAIP](http://www.amazon.com/Paradigms-Artificial-Intelligence-Programming-Studies/dp/1558601910/). – danlei Sep 30 '12 at 19:51
By the way: Even if Touretzky might be a little too slow for you, it has nice little exercises with solutions that should fit your needs. – danlei Sep 30 '12 at 19:57
I would also recommend the *first* Graham book, "ANSI Common Lisp". Clear writing, to the point. Also, " LISP 1.5 Programmer's Manual by John McCarthy". Seriously. :) – Will Ness Oct 02 '12 at 20:01
@danlei: I don't agree you can start reasoning at an higher level. There is no way you'll remember which functions are consing and which aren't, or what is the complexity of a function unless you can attach a mental working model to them. You cannot just add a consing=true flag or O(n^2) label to each function and memorize them, and you cannot consult documentation at every open parenthesis you type. OTOH if you don't know which is which your code will be horrible. – 6502 Oct 03 '12 at 19:41
Maybe you'd like to explain what's so horrible about the above code? Starting reasoning at a higher level is how it came to be. It seems to be readable, maintainable, and transparently correct. If it is a bottleneck, a more performant implementation can be written. There is merit in knowing about implementation and performance characteristics of one's tools, but this is not a necessary condition for using them at all. I never made an argument against learning about algorithms and data structures in general, but correctly solving problems is more important to me than imaginary consing flags. – danlei Oct 03 '12 at 21:11

score 3 · Answer 2 · edited May 23 '17 at 12:10

One way to approach this is to separate your concerns. One is flattening; another is duplicates-removing; yet another is result-building.

Starting with empty list as your result, proceed to add into it the elements of the first list, skipping such elements that are already in the result.

Then do the same with the second list's elements, adding them to the same result list.

(defun my-union (a b &aux (res (list 1)) (p res))
  (nadd-elts p a)
  (nadd-elts p b)
  (cdr res))

nadd-elts would add to the end of list, destructively updating its last cell (pointed to by p) using e.g. rplacd. An example is here.

To add elements, nadd-elts would emulate the flattening procedure, and add each leaf element into p after checking res for duplicates.

Working in functional style, without destructive update, the general approach stays the same: start with empty result list, add first list into it - without duplicates - then second.

(defun my-union (a b &aux res)
  (setq res (add-into res a))
  (setq res (add-into res b))
  res)

Now we're left with implementing the add-into function.

(defun add-into (res a &aux r1 r2)
  (cond
     ((atom a) .... )
     (T (setq r1 (add-into res (car a)))
        (setq r2 (............ (cdr a)))
        r2)))

The above can be re-written without the auxiliary variables and without set primitives. Try to find out how... OK here's what I meant by that:

(defun my-union (a b) (add-into NIL (cons a b)))

(defun add-into (res a)
  (cond
     ((atom a) .... )
     (T (add-into (add-into res (car a)) 
                  (cdr a)))))

I am pretty new to lisp.Started 2 weeks ago. The homework was supposed to be easy and I way over my head. what does `&aux (res (list 1)) (p res))` means. — user1561949, Sep 30 '12 at 08:22
If you're so new maybe destructive code is not the route for you to go. Were you given any guidance as to which direction to take? Are you supposed to code this in "functional" style perhaps? But general advice stands - start with empty result, add the first list into it - without duplicates - then second. — Will Ness, Sep 30 '12 at 08:26
`&aux` just declares internal variables to the function. One is `res`, its initial value is `(list 1)`. Another is `p`, equal to `res`. — Will Ness, Sep 30 '12 at 08:27
We weren't given any specific routes to go. I know our prof likes recursion, but he didn't specify anything since it's our very first assignment. — user1561949, Sep 30 '12 at 08:36

score 2 · Answer 3 · 2012-09-30T12:31:06.403

Unless you are not allowed to use hash table (for some reason I've encountered this as a requirement before), you could come up with an ordering function that will help you build the resulting set in the way, that you don't have to repeat the search over and over again.

Also, since nested lists are allowed your problem scales down to only removing duplicates in a tree (as you can simply append as many lists as you want before you start processing them.

Now, I'll try to show few examples of how you could do it:

;; Large difference between best and worst case.
;; Lists containing all the same items will be processed
;; in square time
(defun union-naive (list &rest lists)
  (when lists (setf list (append list lists)))
  (let (result)
    (labels ((%union-naive (tree)
               (if (consp tree)
                   (progn
                     (%union-naive (car tree))
                     (when (cdr tree) (%union-naive (cdr tree))))
                   (unless (member tree result)
                     (setq result (cons tree result))))))
      (%union-naive list) result)))

;; Perhaps the best solution, it is practically linear time
(defun union-hash (list &rest lists)
  (when lists (setf list (append list lists)))
  (let ((hash (make-hash-table)) result)
    (labels ((%union-hash (tree)
               (if (consp tree)
                   (progn
                     (%union-hash (car tree))
                     (when (cdr tree) (%union-hash (cdr tree))))
                   (setf (gethash tree hash) t))))
      (%union-hash list))
    (maphash
     #'(lambda (a b)
         (declare (ignore b))
         (push a result)) hash)
    result))

;; This will do the job in more time, then the
;; solution with the hash-map, but it requires
;; significantly less memory. Memory is, in fact
;; a more precious resource some times, but you
;; have to decide what algo to use based on the
;; data size
(defun union-flatten (list &rest lists)
  (when lists (setf list (append list lists)))
  (labels ((%flatten (tree)
             (if (consp tree)
                 (if (cdr tree)
                     (nconc (%flatten (car tree))
                            (%flatten (cdr tree)))
                     (%flatten (car tree)))
                 (list tree))))
    ;; the code below is trying to do something
    ;; that you could've done using 
    ;; (remove-duplicates (%flatten list))
    ;; however sorting and then removing duplicates
    ;; may prove to be more efficient
    (reduce
     #'(lambda (a b)
         (cond
           ((atom a) (list a))
           ((eql (car a) b) b)
           (t (cons b a))))
     (sort (%flatten list)
           #'(lambda (a b)
               (string< (symbol-name a)
                        (symbol-name b)))))))



(union-naive '(((a))) '(b(c((d e))a)))
(union-hash '(((a))) '(b(c((d e))a)))
(union-flatten '(((a))) '(b(c((d e))a)))

Notice that the function I've used to order elements is not universal, but you would probably be able to come up with an alternative function for any sort of data. Any fast hashing function in general would do, I've used this one for simplicity.

"nested lists are allowed..." do you mean *in the input*? This can be confusing. — Will Ness, Sep 30 '12 at 12:39

Lisp function: union

3 Answers3