6

contextualization: I've been doing a university project in which I have to write a parser for regular expressions and build the corresponding epsilon-NFA. I have to do this in Prolog and Lisp. I don't know if questions like this are allowed, if not I apologize.

I heard some of my classmates talking about how they used the function gensym for that, I asked them what it did and even checked up online but I literally can't understand what this function does neither why or when is best to use it.

In particular, I'm more intrested in what it does in Lisp. Thank you all.

false
  • 10,264
  • 13
  • 101
  • 209
zeta.exe
  • 61
  • 1
  • 5

5 Answers5

15

GENSYM creates unique symbols. Each call creates a new symbol. The symbol usually has a name which includes a number, which is counted up. The name is also unique (the symbol itself is already unique) with a number, so that a human reader can identify different uninterned symbols in the source code.

CL-USER 39 > (gensym)
#:G1083

CL-USER 40 > (gensym)
#:G1084

CL-USER 41 > (gensym)
#:G1085

CL-USER 42 > (gensym)
#:G1086

gensym is often used in Lisp macros for code generation, when the macro needs to create new identifiers, which then don't clash with existing identifiers.

Example: we are going to double the result of a Lisp form and we are making sure that the Lisp form itself will be computed only once. We do that by saving the value in a local variable. The identifier for the local variable will be computed by gensym.

CL-USER 43 > (defmacro double-it (it)
               (let ((new-identifier (gensym)))
                 `(let ((,new-identifier ,it))
                    (+ ,new-identifier ,new-identifier))))
DOUBLE-IT

CL-USER 44 > (macroexpand-1 '(double-it (cos 1.4)))
(LET ((#:G1091 (COS 1.4)))
  (+ #:G1091 #:G1091))
T

CL-USER 45 > (double-it (cos 1.4))
0.33993432
Rainer Joswig
  • 136,269
  • 10
  • 221
  • 346
10

a little clarification of the existing answers (as the op is not yet aware of the typical common lisp macros workflow):

consider the macro double-it, proposed by mr. Joswig. Why would we bother creating this whole bunch of let? when it can be simply:

(defmacro double-it (it)
  `(+ ,it ,it))

and ok, it seems to be working:

CL-USER> (double-it 1)
;;=> 2

but look at this, we want to increment x and double it

CL-USER> (let ((x 1))
           (double-it (incf x)))
;;=> 5 
;; WHAT? it should be 4!

the reason can be seen in macro expansion:

(let ((x 1))
  (+ (setq x (+ 1 x)) (setq x (+ 1 x))))

you see, as the macro doesn't evaluate form, just splices it into generated code, it leads to incf being executed twice.

the simple solution is to bind it somewhere, and then double the result:

(defmacro double-it (it)
  `(let ((x ,it))
     (+ x x)))

CL-USER> (let ((x 1))
           (double-it (incf x)))
;;=> 4
;; NICE!

it seems to be ok now. really it expands like this:

(let ((x 1))
  (let ((x (setq x (+ 1 x))))
    (+ x x)))

ok, so what about the gensym thing?

let's say, you want to print some message, before doubling your value:

(defmacro double-it (it)
  `(let* ((v "DOUBLING IT")
          (val ,it))
     (princ v)
     (+ val val)))

CL-USER> (let ((x 1))
           (double-it (incf x)))
;;=> DOUBLING IT
;;=> 4
;; still ok!

but what if you accidentally name value v instead of x:

CL-USER> (let ((v 1))
           (double-it (incf v)))

;;Value of V in (+ 1 V) is "DOUBLING IT", not a NUMBER.
;; [Condition of type SIMPLE-TYPE-ERROR]

It throws this weird error! Look at the expansion:

(let ((v 1))
  (let* ((v "DOUBLING IT") (val (setq v (+ 1 v))))
    (princ v)
    (+ val val)))

it shadows the v from the outer scope with string, and when you are trying to add 1, well it obviously can't. Too bad.

another example, say you want to call the function twice, and return 2 results as a list:

(defmacro two-funcalls (f v)
  `(let ((x ,f))
     (list (funcall x ,v) (funcall x ,v))))

CL-USER> (let ((y 10))
           (two-funcalls (lambda (z) z) y))
;;=> (10 10)
;; OK

CL-USER> (let ((x 10))
           (two-funcalls (lambda (z) z) x))

;; (#<FUNCTION (LAMBDA (Z)) {52D2D4AB}> #<FUNCTION (LAMBDA (Z)) {52D2D4AB}>)
;; NOT OK!

this class of bugs is very nasty, since you can't easily say what's happened. What is the solution? Obviously not to name the value v inside macro. You need to generate some sophisticated name that no one would reproduce in their code, like my-super-unique-value-identifier-2019-12-27. This would probably save you, but still you can't really be sure. That's why gensym is there:

(defmacro two-funcalls (f v)
  (let ((fname (gensym)))
    `(let ((,fname ,f))
       (list (funcall ,fname ,v) (funcall ,fname ,v)))))

expanding to:

(let ((y 10))
  (let ((#:g654 (lambda (z) z)))
    (list (funcall #:g654 y) (funcall #:g654 y))))

you just generate the var name for the generated code, it is guaranteed to be unique (meaning no two gensym calls would generate the same name for the runtime session),

(loop repeat 3 collect (gensym))
;;=> (#:G645 #:G646 #:G647)

it still can potentially be clashed with user var somehow, but everybody knows about the naming and doesn't call the var #:GXXXX, so you can consider it to be impossible. You can further secure it, adding prefix

(loop repeat 3 collect (gensym "MY_GUID"))
;;=> (#:MY_GUID651 #:MY_GUID652 #:MY_GUID653)
leetwinski
  • 17,408
  • 2
  • 18
  • 42
  • 2
    those symbols are *uninterned*. try `(let ((#:G1 1)) #:G1)`. – Will Ness Dec 27 '19 at 18:54
  • 1
    This was a great explanation! Thank you! :D – kaeland Sep 18 '22 at 05:24
  • I reckon that it won't work without the comma's in front of the fname in this example. But symbols are self evaluating thus comma's in front of a symbol in a quasiquote makes no sense. This can't be the whole story. It works, but an explanation why this unquoting is needed should be nice. – Albert van der Horst Aug 29 '23 at 10:00
3

GENSYM will generate a new symbol at each call. It will be garanteed, that the symbol did not exist before it will be generated and that it will never be generated again. You may specify a symbols prefix, if you like:

CL-USER> (gensym)
#:G736
CL-USER> (gensym "SOMETHING")
#:SOMETHING737

The most common use of GENSYM is generating names for items to avoid name clashes in macro expansion.

Another common purpose is the generaton of symbols for the construction of graphs, if the only thing demand you have is to attach a property list to them, while the name of the node is not of interest.

I think, the task of NFA-generation could make good use of the second purpose.

Patrick
  • 508
  • 2
  • 9
0

This is a note to some of the other answers, which I think are fine. While gensym is the traditional way of making new symbols, in fact there is another way which works perfectly well and is often better I find: make-symbol:

make-symbol creates and returns a fresh, uninterned symbol whose name is the given name. The new-symbol is neither bound nor fbound and has a null property list.

So, the nice thing about make-symbol is it makes a symbol with the name you asked for, exactly, without any weird numerical suffix. This can be helpful when writing macros because it makes the macroexpansion more readable. Consider this simple list-collection macro:

(defmacro collecting (&body forms)
  (let ((resultsn (make-symbol "RESULTS"))
        (rtailn (make-symbol "RTAIL")))
    `(let ((,resultsn '())
           (,rtailn nil))
       (flet ((collect (it)
                (let ((new (list it)))
                  (if (null ,rtailn)
                      (setf ,resultsn new
                            ,rtailn new)
                    (setf (cdr ,rtailn) new
                          ,rtailn new)))
                it))
         ,@forms
         ,resultsn))))

This needs two bindings which the body can't refer to, for the results, and the last cons of the results. It also introduces a function in a way which is intentionally 'unhygienic': inside collecting, collect means 'collect something'.

So now

> (collecting (collect 1) (collect 2) 3)
(1 2)

as we want, and we can look at the macroexpansion to see that the introduced bindings have names which make some kind of sense:

> (macroexpand '(collecting (collect 1)))
(let ((#:results 'nil) (#:rtail nil))
  (flet ((collect (it)
           (let ((new (list it)))
             (if (null #:rtail)
                 (setf #:results new #:rtail new)
               (setf (cdr #:rtail) new #:rtail new)))
           it))
    (collect 1)
    #:results))
t

And we can persuade the Lisp printer to tell us that in fact all these uninterned symbols are the same:

> (let ((*print-circle* t))
    (pprint (macroexpand '(collecting (collect 1)))))

(let ((#2=#:results 'nil) (#1=#:rtail nil))
  (flet ((collect (it)
           (let ((new (list it)))
             (if (null #1#)
                 (setf #2# new #1# new)
               (setf (cdr #1#) new #1# new)))
           it))
    (collect 1)
    #2#))

So, for writing macros I generally find make-symbol more useful than gensym. For writing things where I just need a symbol as an object, such as naming a node in some structure, then gensym is probably more useful. Finally note that gensym can be implemented in terms of make-symbol:

(defun my-gensym (&optional (thing "G"))
  ;; I think this is GENSYM
  (check-type thing (or string (integer 0)))
  (let ((prefix (typecase thing
                  (string thing)
                  (t "G")))
        (count (typecase thing
                 ((integer 0) thing)
                 (t (prog1 *gensym-counter*
                      (incf *gensym-counter*))))))
        (make-symbol (format nil "~A~D" prefix count))))

(This may be buggy.)

  • 4
    An idea of GENSYM is that one can create uninterned symbols symbols with a prefix string, where I can still guess by the numeric suffix that the symbols are the same, without the need to turn on the circle print feature, which makes code less readable. – Rainer Joswig Dec 27 '19 at 22:41
  • @RainerJoswig: I find it just as easy to guess that `#:rtail` and `#:rtail` are the same as `#:g9123` and `#g9123`, and the former has the advantage, for me, that I can also guess what `#:rtail` is meant to mean. The case where this would be hard is really nested expansions of the same macro. –  Dec 28 '19 at 13:20
  • 2
    @RainerJoswig: of course, using `(gensym "RTAIL")` &c might be even better. –  Dec 28 '19 at 14:11
  • 1
    Right, like for example in LOOP in SBCL, the gensyms almost all have a speaking prefix. – Rainer Joswig Dec 28 '19 at 15:50
-1

I think the answers are not precise. (gensym) cannot evaluate to a new symbol, because usage as

( let (a (gensym)) ..)

would be syntactically incorrect. Instead it actually returns a function, where evaluating a yield a unique symbol (determined at calling (gensym) ) that is the same for each evaluation of a . So after this phrase `a is not actually a symbol although it is used as if, In fact the symbol that results is immediately used as if the unique symbol were substituted. It follows that this make only sense in macro manipulations where one is in fear of conflicting symbols.

  • This answer seems confused. [`gensym`](http://www.lispworks.com/documentation/HyperSpec/Body/f_gensym.htm) _does_ return a symbol, not a function: `(let ((a (gensym))) (type-of a))` --> `SYMBOL`. With `(let (a (gensym)) ;...)` the value slots of the _symbols_ `a` and `gensym` are both bound to `nil`, i.e., the function `gensym` is not called in this code. – ad absurdum Aug 30 '23 at 13:45