2

Footnote #28 of SICP says the following:

Embedded definitions must come first in a procedure body. The management is not responsible for the consequences of running programs that intertwine definition and use.

What exactly does this mean? I understand:

  • "definitions" to be referring to both procedure creations and value assignments.
  • "a procedure body" to be as defined in section 1.1.4. It's the code that comes after the list of parameters in a procedure's definition.
  • "Embedded definitions must come first in a procedure body" to mean 'any definitions that are made and used in a procedure's body must come before anything else'.
  • "intertwine definition and use" to mean 'calling a procedure or assigned value before it has been defined;'

However, this understanding seems to contradict the answers to this question, which has answers that I can summarise as 'the error that your quote is referring to is about using definitions at the start of a procedure's body that rely on definitions that are also at the start of that body'. This has me triply confused:

  1. That interpretation clearly contradicts what I've said above, but seems to have strong evidence - compiler rules - behind it.
  2. SICP seems happy to put definition in a body with other definitions that use them. Just look at the sqrt procedure just above the footnote!
  3. At a glance, it looks to me that the linked question's author's real error was treating num-prod like a value in their definition of num rather than as a procedure. However, the author clearly got it working, so I'm probably wrong.

So what's really going on? Where is the misunderstanding?

J. Mini
  • 1,868
  • 1
  • 9
  • 38
  • see [this comment](https://stackoverflow.com/questions/43403098/scheme-racket-undefined-function-cannot-use-before-initialization#comment113497324_43403098) – Will Ness Oct 03 '20 at 10:14

3 Answers3

3

In a given procedure's definition / code,

  • "internal definitions" are forms starting with define.
  • "a procedure body" is all the other forms after the forms which start with define.
  • "embedded definitions must come first in a procedure body" means all the internal define forms must go first, then all the other internal forms. Once a non-define internal form appears, there can not appear an internal define form after that.
  • "no intertwined use" means no use of any name before it's defined. Imagine all internal define forms are gathered into one equivalent letrec, and follow its rules.

Namely, we have,

(define (proc args ...)
   ;; internal, or "embedded", definitions
   (define a1 ...init1...)
   (define a2 ...init2...)
   ......
   (define an ...initn...)
   ;; procedure body
   exp1 exp2 .... )

Any ai can be used in any of the initj expressions, but only inside a lambda expression.(*) Otherwise it would refer to the value of ai while aj is being defined, and that is forbidden, because any ai names are considered not yet defined while any of the initj expressions are being evaluated.


(*) Remember that (define (foo x) ...x...) is the same as (define foo (lambda (x) ...x...)). That's why the definitions in that sqrt procedure in the book you mention are OK -- they are all lambda expressions, and any name's use inside a lambda expression will only actually refer to that name's value when the lambda expression's value -- a lambda function -- will be called, not when that lambda expression is being evaluated, producing that lambda function which is its value.


The book is a bit vague at first with the precise semantics of its language but in Scheme the above code is equivalent to

(define proc 
   (lambda (args ...)
      ;; internal, or "embedded", definitions
      (letrec ( (a1 ...init1...)
                (a2 ...init2...)
                ......
                (an ...initn...) )
        ;; procedure body
        exp1 exp2 .... 
        )))

as can be seen explained, for instance, here, here or here.


For example,

                                   ;; or, equivalently,
(define (my-proc x)                (define my-proc
                                     (lambda (x)
   (define (foo) a)                     (letrec ( (foo (lambda () a))
   (define a x)                                   (a x) )
   ;; my-proc's body                       ;; letrec's body
   (foo))                                  (foo))))

first evaluates the lambda expression, (lambda () a), and binds the name foo to the result, a function; that function will refer to the value of a when (foo) will be called, so it's OK that there's a reference to a inside that lambda expression -- because when that lambda expression is evaluated, no value of a is immediately needed, just the reference to its future value, under the name a, is present there; i.e. the value that a will have after all the names in that letrec are initialized, and the body of letrec is entered. Or in other words, when all the internal defines are completed and the body proper of the procedure my-proc is entered.

So we see that foo is defined, but is not used during the initializations; a is defined but is not used during the initializations; thus all is well. But if we had e.g.

(define (my-proc x)
   (define (foo) x)    ; `foo` is defined as a function
   (define a (foo))    ; `foo` is used by being called as a function
   a)

then here foo is both defined and used during the initializations of the internal, or "embedded", defines; this is forbidden in Scheme. This is what the book is warning about: the internal definitions are only allowed to define stuff, but its use should be delayed for later, when we're done with the internal defines and enter the full procedure's body.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
  • So there's no guarantee of the order that function definitions are parsed, but function definitions inside function definitions will always be parsed later than the function definitions that aren't inside other function definitions? – J. Mini Oct 05 '20 at 19:45
  • not "parsed", but performed. the body of a function is parsed to define that function. but the code inside the body of that function will only be executed when that function is called. --- (do consult the meaning of `let/`letrec`. there are lots of questions and answers here on SO, including [by me](https://stackoverflow.com/search?q=user%3A849891+%5Bscheme%5D+let+letrec) ([e.g.](https://stackoverflow.com/a/15006018/849891). there is no order there, they are all defined simultaneously, conceptually speaking.) – Will Ness Oct 06 '20 at 05:30
  • also, the book it being intentionally vague at first, it wants to present the natural development of "a language". it only much later gives it a strict and firm interpretation by presenting the interpreter for it. several interpreters, in fact, each for a different meaning / semantics of "a language". – Will Ness Oct 06 '20 at 05:34
  • If this all comes down to lambda expressions, then what went wrong here stackoverflow.com/q/43403098/10319707 ? You've commented on that question, but I'm unsure about how your comment fits in to this answer. I suspect that I've got lost at some point in the distinction between ```define``` for procedures and ```define``` for value assignment. – J. Mini Oct 08 '20 at 18:57
  • I think that the sentence "Any ```ai``` can be used in any of the ```initj``` expressions, but only inside a lambda expression" has me confused. How could I possibly put anything in ```initj```without it being in a lambda expression? To my understanding, everything in ```initj``` is in a lambda expression. – J. Mini Oct 08 '20 at 19:12
  • there is no distinction b/w define and define. define is define. value is value. functions are values too. in Scheme these are not assignments but bindings. a name gets bound to a value. a value is produced by evaluation of expression. a lambda expression's value is a function. if some name is mentioned inside that lambda expression then its value is not needed when that lambda expression is evaluated. if some name is mentioned in an expression (not inside lambda) then its value is needed when that expressions is evaluated. please read up on the basics of Scheme evaluation, let, and letrec. – Will Ness Oct 08 '20 at 21:39
  • *"How could I possibly put anything in `initj` without it being in a lambda expression?"* simple, as in the linked question: `(define (my-proc x) (define a 1) (define b (+ a 2)) x)`. – Will Ness Oct 08 '20 at 21:51
  • so `(define (my-proc x) (define a x) (define foo (lambda () a)) (foo))` first evaluates the lambda expression, `(lambda () a)`, and binds the name `foo` to the result, a function; that function ***will*** refer to the value of `a`, when `(foo)` ***will*** be called, so it's ok that there's a reference to `a` ***inside that lambda expression*** -- because when that lambda expression is evaluated no value of `a` is immediately needed, just the reference to its future value, under the name `a`, is present there. – Will Ness Oct 09 '20 at 11:23
  • and then, after all the internal `define` forms, come the forms that will be executed when `my-proc` is called, as e.g. `(my-proc 7)`. and so, `(foo)` is called, which is the same as calling `((lambda () a))`, which is the same as `a`, which is the same as `x`, because of the internal definition `(define a x)`. which is the same as `7`, because we called `my-proc` with `7` as an argument, `(my-proc 7)`. – Will Ness Oct 09 '20 at 11:28
  • You might want to edit your answer to include that. The final block of code in your answer shows every ```initj``` as being inside the parent lambda expression that's on the second line, so you can see why I'd wrongly think "then isn't every ```intj``` inside a lambda expression?". – J. Mini Oct 09 '20 at 11:55
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/222792/discussion-between-will-ness-and-j-mini). – Will Ness Oct 09 '20 at 15:59
2

This is subtle, and as the footnote and the question you reference imply, the subtlties can vary depending on a particular language's implementation.

These issues will be covered in far more detail later in the book (Chapters 3 and 4) and, generally, the text avoids using internal definitions so that these issues can be avoided until they are examined in detail.

A key difference between between the code above the footnote:

(define (sqrt x)
  (define (good-enough? guess)
    (< (abs (- (square guess) x)) 0.001))
  (define (improve guess)
    (average guess (/ x guess)))
  (define (sqrt-iter guess)
    (if (good-enough? guess)
        guess
        (sqrt-iter (improve guess))))
  (sqrt-iter 1.0))

and the code in the other question:

(define (pi-approx n)
  (define (square x) (* x x))

  (define (num-prod ind) (* (* 2 ind) (* 2 (+ ind 1)))) ; calculates the product in the numerator for a certain term
  (define (denom-prod ind) (square (+ (* ind 2 ) 1))) ;Denominator product at index ind

  (define num (product num-prod 1 inc n))
  (define denom (product denom-prod 1 inc n))

is that all the defintions in former are procedure definitions, whereas num and denom are value defintions. The body of a procedure is not evaluated until that procedures is called. But a value definition is evaluated when the value is assigned.

With a value definiton:

(define sum (add 2 2))

(add 2 2) will evaluated when the definition is evaluated, if so add must already be defined. But with a procedure definition:

(define (sum n m) (add n m))

a procedure object will be assigned to sum but the procedure body is not yet evaluated so add does not need be defined when sum is defined, but must be by the time sum is called:

(sum 2 2)

As I say there is a lot sublty and a lot of variation so I'm not sure if the following is always true for every variation of scheme, but within 'SICP scheme' you can say..

Valid (order of evaluation of defines not significant):

;procedure body
(define (sum m n) (add m n))
(define (add m n) (+ m n))
(sum 2 2)

Also valid:

;procedure body
(define (sum) (add 2 2))
(define (add m n) (+ m n))
(sum)

Usually invalid (order of evaluation of defines is significant):

;procedure body
(define sum (add 2 2))
(define (add m n) (+ m n))

Whether the following is valid depends on the implementation:

;procedure body
(define (add m n) (+ m n))
(define sum (add 2 2))

And finally an example of intertwining definition and use, whether this works also depends on the implementation. IIRC, this would work with Scheme described in Chapter 4 of the book if the scanning out has been implemented.

;procedure body
(sum)
(define (sum) (add 2 2))
(define (add m n) (+ m n))

It is complicated and subtle, but the key points are:

  1. value definitions are evaluated differently from procedure definitions,
  2. behaviour inside blocks may be different from outside blocks,
  3. there is variation between scheme implementations,
  4. the book is designed so you don't have to worry much about this until Chapter 3,
  5. the book will cover this in detail in Chapter 4.
codybartfast
  • 7,323
  • 3
  • 21
  • 25
  • I think that this is very close to an answer. All that is missing is spelling out exactly what the quote means (rather than just giving an example) and maybe a short line that leaves no doubt about where the linked question's code failed. You've done the background work excellently; It just needs that killing blow. – J. Mini Oct 01 '20 at 12:08
0

You discovered one of the difficulties of scheme. And of lisp. Due to this kind of chicken and egg issue there is no single lisp, but lots of lisps appeared.

If the binding forms are not present in code in a letrec-logic in R5RS and letrec*-logic in R6RS, the semantics is undefined. Which means, everything depend on the will of the implementor of scheme.

See the paper Fixing Letrec: A Faithful Yet Efficient Implementation of Scheme’s Recursive Binding Construct.

Also, you can read the discussions from the mailing list from 1986, when there was no general consensus among different implementors of scheme.

Also, at MIT there were developed 2 versions of scheme -- the student version and the researchers' development version, and they behaved differently concerning the order of define forms.

alinsoar
  • 15,386
  • 4
  • 57
  • 74
  • You say "You discovered one of the difficulties of scheme." but you do not name it? is it implementation-defined behaviour? – coredump Oct 01 '20 at 11:04
  • @coredump the order of forms of type definition vs. expression, as you can see inside the OP's question. and inside my answer. – alinsoar Oct 01 '20 at 11:18
  • But then I fail to see the relationship with lisp, which does not have the same problem (?) – coredump Oct 01 '20 at 11:34
  • @coredump I cannot help you, sorry. I saw other lisps with other rules of bindings. I bow out of this discussion. – alinsoar Oct 01 '20 at 11:36
  • @coredump: I don't think this is a problem in fact: rather it is a matter of how different languages are defined. CL, for instance I think is defined to 'resolve' things like this in a way which is unambiguous (I believe) so all implementations should do the same thing (this is helped for CL by it being a Lisp-2). It may be that the Scheme standard does not completely specify behaviour here so different implementations may legitimately vary. In either case it's not a problem, it's just how much variation language standards allow. –  Oct 01 '20 at 11:38
  • @tfb I agree this is not really a problem, it is a matter of how precise the spec is; it starts being a portability problem when users start relying too much on one implementation – coredump Oct 01 '20 at 11:44
  • @tfb internal `defines` is just meaningless "convenience" sugar, can and should be easily replaced with `letrec`, which does have firm specification. the SICP book been written in the olden days, the switch to lexical scope was still a recent thing. they treat `define` as an instruction to be carried out, not a keyword in a program's text. very Lisp-y like. :) – Will Ness Oct 03 '20 at 10:07
  • @coredump (see above comment). it's not a problem. internal defines shouldn't be used. letrec should be used instead. – Will Ness Oct 03 '20 at 10:07
  • @WillNess: I realise internal `define` is sugar on `letrec` but it's useful sugar for me, because it reduces indentation. Indeed I wish CL had similar sugar standardly. After all, `letrec` is just a convenience on Y. –  Oct 03 '20 at 18:25
  • @tfb sugar is bad for you. :) (kidding) as the problem in the linked question shows, it *is* convenient, but [adds confusion](https://stackoverflow.com/questions/43403098/scheme-racket-undefined-function-cannot-use-before-initialization#comment113497324_43403098). **Y** traditionally only *emulates* self-reference, through replication; `letrec` is free to - and does - create actual self-reference. also, mutual recursion with bare Y is convoluted, but trivial with `letrec`. increased indentation sometimes increases readability. but, in matters of taste...... YMMV. :) – Will Ness Oct 03 '20 at 18:54
  • @tfb if the internal defines does not follow the letrec syntax it is undefined behaviour. In the general case you arrive at chicken and egg problem. – alinsoar Oct 03 '20 at 20:35