R7RS-small says that all identifiers must be terminated by a delimiter, but at the same time it defines pretty elaborate rules for what can be in an identifier. So, which one is it?

Is an identifier supposed to start with an initial character and then continue until a delimiter, or does it start with an initial character and continue according to the syntax defined in section 7.1.1?

Here are a couple of obvious cases. Are these valid identifiers?

  • a#a
  • b,b
  • c'c
  • d[d]

If they are not supposed to be valid, what is the purpose of saying that an identifier must be terminated by a delimiter?

Flux
Tzvetan Mikov
  • Perhaps the intent is that a valid identifier must be followed by a delimiter, or otherwise it is invalid. – Tzvetan Mikov Mar 08 '20 at 03:53
  • Another way to cope with Scheme grammars is to search for read-atom. This is a non-standardized Lisp function that implementations use internally to read symbols, numbers, and atoms more generally; afterwards it tries to classify the atom, via backtracking, as a number, a symbol, etc. – alinsoar Apr 07 '20 at 12:38
  • I also edited my answer. – alinsoar Apr 07 '20 at 12:44

1 Answer

In R7RS, vertical bars delimit symbols written as |..ident..|, which lets a symbol contain any character that cannot appear in an old-style symbol (| is the delimiter).

However, in R6RS the "official" grammar was incorrect, as it did not allow symbols such as 1+, which led implementations to define their own rules to work around this deficiency in the official grammar.

Unless you need to read the source code of a given implementation to see how it defines symbols, you should not care too much about these rules; just use classical symbols.

In section 7.1.1 you will find the Backus-Naur form that defines the lexical structure of R7RS identifiers, but I doubt that implementations follow it.
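To make the grammar concrete, here is a minimal Python sketch that tests a string against the non-vertical-bar identifier productions as I read them in section 7.1.1. It is an assumption-laden transcription, not an authoritative one: letters are restricted to ASCII, the `|...|` form is left out, and the note that `+i`, `-i`, and the infinity/NaN spellings are read as numbers is not modelled. The names `IDENTIFIER` and `is_plain_identifier` are made up for this example.

```python
import re

# Assumed transcription of the 7.1.1 character classes (ASCII letters only).
INITIAL = r"[A-Za-z!$%&*/:<=>?^_~]"
SUBSEQUENT = r"[A-Za-z!$%&*/:<=>?^_~0-9+\-.@]"
SIGN_SUBSEQUENT = r"[A-Za-z!$%&*/:<=>?^_~+\-@]"
DOT_SUBSEQUENT = r"[A-Za-z!$%&*/:<=>?^_~+\-@.]"

# <peculiar identifier>: a lone sign, or forms starting with a sign and/or a dot.
PECULIAR = (
    rf"[+-]{SIGN_SUBSEQUENT}{SUBSEQUENT}*"
    rf"|[+-]\.{DOT_SUBSEQUENT}{SUBSEQUENT}*"
    rf"|\.{DOT_SUBSEQUENT}{SUBSEQUENT}*"
    rf"|[+-]"
)

# <identifier> without the |...| alternative.
IDENTIFIER = re.compile(rf"(?:{INITIAL}{SUBSEQUENT}*|{PECULIAR})")


def is_plain_identifier(text: str) -> bool:
    """True if the whole string matches the transcribed grammar."""
    return IDENTIFIER.fullmatch(text) is not None


for candidate in ["a#a", "b,b", "c'c", "d[d]", "1+", "list->vector"]:
    print(candidate, is_plain_identifier(candidate))
```

Under this reading, the four examples from the question and `1+` are rejected, because `#`, `,`, `'`, `[`, and `]` are not in the subsequent class and `1` is not in the initial class, while `list->vector` is accepted.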

I quote from here

As with identifiers, different implementations of Scheme use slightly different rules, but it is always the case that a sequence of characters that contains no special characters and begins with a character that cannot begin a number is taken to be a symbol

In other words, an implementation will use a function like read-atom to read characters up to the next delimiter, and will then classify the atom by backtracking: it first tries to parse the atom with read-number, and if that fails, the atom is taken to be a symbol.
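The strategy described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `read_atom` and `classify` are hypothetical names, the delimiter set is whitespace plus `|`, `(`, `)`, `"`, and `;`, and the number check is a deliberately simplified decimal pattern standing in for the full Scheme numeric syntax. Real readers also differ in how they treat characters such as `'` and `,`, which this sketch does not handle specially.

```python
import re

# Assumed delimiter set: whitespace, vertical line, parentheses, double quote, semicolon.
DELIMITERS = set(' \t\n\r|()";')

# Simplified stand-in for Scheme number syntax: optional sign, decimal digits,
# optional fractional part. Real Scheme numbers are much richer.
SIMPLE_NUMBER = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)")


def read_atom(text, start=0):
    """Consume characters from `start` up to the next delimiter.

    Returns the atom and the index where reading stopped.
    """
    end = start
    while end < len(text) and text[end] not in DELIMITERS:
        end += 1
    return text[start:end], end


def classify(atom):
    """Try the atom as a number first; anything else falls back to a symbol."""
    if SIMPLE_NUMBER.fullmatch(atom):
        return "number"
    return "symbol"


atom, stop = read_atom("1+ 42)")
print(atom, classify(atom))
```

With this approach `1+` ends up as a symbol simply because it fails the number test, regardless of what the formal identifier grammar says, which is why implementations built this way accept more identifiers than the grammar does.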

alinsoar
  • I would like to understand the intent of the spec, say if I wanted to make a fresh R7RS implementation. Why does it say that "Identifiers that do not begin with a vertical line are terminated by a ⟨delimiter⟩" and at the same time provide a different grammar for identifiers? Is the intent that a valid identifier must also be terminated by a delimiter in addition to what the grammar says? – Tzvetan Mikov Mar 11 '20 at 01:57
  • @TzvetanMikov the syntactic class ⟨delimiter⟩ is defined in 7.1.1, as I've already said: it's made of whitespace, (, ), ", etc. If you are not familiar with this notation from 7.1, then you should begin by playing with simple things before trying an R7RS implementation. Better to try implementing R4RS first. – alinsoar Mar 11 '20 at 02:11
  • I am quite familiar with the notation, thank you, but it is ambiguous. One should be able to derive the behavior only from the text without referring to existing implementations. If you follow the grammar, there is no need to terminate an identifier with a delimiter. Is the delimiter rule intended to be applied in addition to the grammar? Specifically, how are the examples I gave in my question intended to be parsed, like "a,a"? "," is not a delimiter, but it is not part of ⟨initial⟩ or ⟨subsequent⟩ either. – Tzvetan Mikov Mar 11 '20 at 04:55
  • @TzvetanMikov I have now looked over the R7RS grammar. It has the same problem as the R6RS grammar: `1+` is not a valid symbol in R7RS either (because `1` does not belong to the class `⟨initial⟩`). But all implementations define their own rules so that some such symbols are valid. So for your particular symbols you need to consult the source code of the implementation that interests you. – alinsoar Mar 12 '20 at 08:31
  • @TzvetanMikov none of your examples are valid identifiers in the official grammar, and if you see that they are valid in some implementation, then that implementation does not follow the official definition. – alinsoar Mar 12 '20 at 08:40