2

I am reading The Seasoned Schemer by Friedman and Felleisen, but I am a little uneasy with some of their best practices. In particular, the authors recommend:

  • using letrec to remove arguments that do not change for recursive applications;
  • using letrec to hide and to protect functions;
  • using letcc to return values abruptly and promptly.

Let's examine some consequences of these rules. Consider for example this code for computing the intersection of a list of lists:

#lang scheme

(define intersectall
  (lambda (lset)
    (let/cc hop
      (letrec
          [[A (lambda (lset)
                (cond [(null? (car lset)) (hop '())]
                      [(null? (cdr lset)) (car lset)]
                      [else (I (car lset) (A (cdr lset)))]))]
           [I (lambda (s1 s2)
                (letrec
                    [[J (lambda (s1)
                          (cond [(null? s1) '()]
                                [(M? (car s1) s2) (cons (car s1) (J (cdr s1)))]
                                [else (J (cdr s1))]))]
                     [M? (lambda (el s)
                           (letrec
                               [[N? (lambda (s)
                                      (cond [(null? s) #f]
                                            [else (or (eq? (car s) el) (N? (cdr s)))]))]]
                             (N? s)))]]
                  (cond [(null? s2) (hop '())]
                        [else (J s1)])))]]
        (cond [(null? lset) '()]
              [else (A lset)])))))

This example appears in Chapter 13 (not exactly like this: I glued the membership testing code that is defined separately in a previous paragraph).

I think the following alternative implementation, which makes very limited use of letrec and letcc is much more readable and simpler to understand:

(define intersectall-naive
  (lambda (lset)
    (letrec
        [[IA (lambda (lset)
              (cond [(null? (car lset)) '()]
                    [(null? (cdr lset)) (car lset)]
                    [else (intersect (car lset) (IA (cdr lset)))]))]
         [intersect (lambda (s1 s2)
                      (cond [(null? s1) '()]
                            [(M? (car s1) s2) (cons (car s1) (intersect (cdr s1) s2))]
                            [else (intersect (cdr s1) s2)]))]
         [M? (lambda (el s)
               (cond [(null? s) #f]
                     [else (or (eq? (car s) el) (M? el (cdr s)))]))]]
    (cond [(null? lset) '()]
          [else (IA lset)]))))

I am new to scheme and my background is not in Computer Science, but it strikes me that we have to end up with such complex code for a simple list intersection problem. It makes me wonder how people manage the complexity of real-world applications. Are experienced schemers spending their days deeply nesting letcc and letrec expressions?

This was the motivation for asking stackexchange.

My question is: are Friedman and Felleisen overly complicating this example for education's sake, or should I just get accustomed to scheme code full of letccs and letrecs for performance reasons? Is my naive code going to be impractically slow for large lists?

J. D.
  • 279
  • 1
  • 9
  • 1
    they could at least use some `let` bindings to avoid all those repetitious `car` and `cdr` calls which make the code *completely unreadable*. good code should be visually apparent. that surface syntax is terrible, forces you to think about mundane stuff instead of being able to simply see it. it is obfuscation, is what it is. with good syntax (even if it's [just some made-up pseudocode](https://imgur.com/a/SwVDhPv)) it is easy to understand, and even is easy to change into not needing any continuation capturing, which *is* a performance killer, actually. – Will Ness Oct 18 '20 at 19:27
  • Hi @WillNess, your pseudocode looks very good, which language did inspire you? Looking at your solution made me think that probably a real-world code would make extensive use of pattern matching in this situation. Unfortunately *The Seasoned Schemer* does not introduce pattern matching (at least not yet, I'm halfway through). I am afraid using `let` bindings to avoid `car` and `cdr` calls would not save a lot of space, because `let` evaluates before binding, and I must make sure that the list actually has a `cdr` before asking for its `cdr` and binding it to a variable. – J. D. Oct 20 '20 at 13:18
  • partly Haskell, partly my [evolving](https://stackoverflow.com/search?q=user%3A849891+%22pseudocode%22+is%3Aa+%5Bscheme%5D) [imagination](https://stackoverflow.com/a/34910496/). I even thought up the `[a, ...d...]` syntax at first, then it turned out JS had `[a, ...d]` (for ages ?) already. (but I don't know JS. there's a log list of languages I don't know, python, clojure, ruby, etc.). yes, premature `cdr` is a problem, but with pseudocode, we get to wish it away, saying "it's a definition, not an instruction for immediate action". – Will Ness Oct 20 '20 at 15:17
  • (contd.) I think `{x => x}` syntax for lambdas is also in JS. (in Haskell, `let/where` bindings are lazy, aren't performed until the variable is *really* accessed). a code-analyzer could transform the pseudocode into the "real Scheme"... etc. --- BTW there's no problem with premature `car`/`cdr` in my pseudocode, I think, if the equations are tried one by one in top-down order). we have a separate first equation `intersectall [] = []`, so `lset` in the second one is guaranteed to be non-empty. that's assuming we only perform *one*, 1st equation that matched. unlike Prolog, which does them all. – Will Ness Oct 20 '20 at 15:30

1 Answers1

1

I'm not an expert on Scheme implementations, but I have some ideas about what's going on here. One advantage the authors have via their let/cc that you don't have is early termination when it's clear what the entire result will be. Suppose someone evaluates

(intersectall-naive (list big-list
                          huge-list
                          enormous-list
                          gigantic-list
                          '()))

Your IA will transform this to

(intersect big-list
           (intersect huge-list
                      (intersect enormous-list
                                 (intersect gigantic-list
                                            '()))))

which is reasonable enough. The innermost intersection will be computed first, and since gigantic-list is not nil, it will traverse the entirety of gigantic-list, for each item checking whether that item is a member of '(). None are, of course, so this results in '(), but you did have to walk the entire input to find that out. This process will repeat at each nested intersect call: your inner procedures have no way to signal "It's hopeless, just give up", because they communicate only by their return value.

Of course you can solve this without let/cc, by checking the return value of each intersect call for nullness before continuing. But (a) it is rather pretty to have this check only occur in one direction instead of both, and (b) not all problems will be so amenable: maybe you want to return something where you can't signal so easily that early exit is desired. The let/cc approach is general, and allows early-exit in any context.

As for using letrec to avoid repetition of constant arguments to recursive calls: again, I am not an expert on Scheme implementations, but in Haskell I have heard the guidance that if you are closing over only 1 parameter it's a wash, and for 2+ parameters it improves performance. This makes sense to me given how closures are stored. But I doubt it is "critical" in any sense unless you have a large number of arguments or your recursive functions do very little work: argument handling will be a small portion of the work done. I would not be surprised to find that the authors think this improves clarity, rather than doing it for performance reasons. If I see

(define (f a x y z)
  (define (g n p q r) ...)
  (g (g (g (g a x y z) x y z) x y z) x y z))

I will be rather less happy than if I see

(define (f a x y z)
  (define (g n) ...)
  (g (g (g (g a)))))

because I have to discover that actually p is just another name for x, etc., check that the same x, y, and z are used in each case, and confirm that this is on purpose. In the latter case, it is obvious that x continues to have that meaning throughout, because there is no other variable holding that value. Of course this is a simplified example and I would not be thrilled to see four literal applications of g regardless, but the same idea holds for recursive functions.

amalloy
  • 89,153
  • 8
  • 140
  • 205
  • Hi amalloy, thanks for the answer. I agree that in general passing around fewer arguments makes a function call more clear, but in this specific case wouldn't it be much more clear to just have the membership testing function `M?` instead of both `M?` and `N?`, and to just have `I` instead of both `I` and `J`? I think I was just hoping that the compiler/interpreter could optimise these details without requiring us to be so careful about nesting multiple `letcc` and `letrec` blocks. Am I asking too much? – J. D. Oct 14 '20 at 09:19