Reserved keywords in programming language

Question

I am wondering whether all programming languages reserve keywords. Say If and While are reserved keywords. We should not use them as ordinary variable or function names; for example, If = 3 is illegal, so the compiler will generate an error during the scanner phase. What if a language allows the programmer to use reserved keywords, say If, as variable or function names? How can the compiler handle this? Does it get handled in the scanner or the parser? What should semantic analysis do?

Update: I understand this is not good practice, but is the real reason most or all programming languages do not support it that the scanner or parser cannot accurately scan or parse such a language, or is something else really going on behind the scenes? Thanks.

Such an, erm, ability will "confuse" not only the compiler but also the people who read code in that language. So why allow it at all? – kirilloid, Apr 15 '12 at 23:54

blackcompe · Accepted Answer · 2012-04-16T01:34:48.403

You definitely could do such a thing, but obviously it would destroy the intuitiveness of the source code. Imagine this:

if if == 1

As far as actually implementing it, the lexer wouldn't have to be changed at all. If the lexer matches "if" in the source it returns a token with an IF type. Suppose we have the following assignment statement, where if is a variable name and it's getting assigned the value 1.

if <- 1;

The lexer's token stream to be fed to the parser is:

IF, LARROW, INTLITERAL, SEMICOLON

I might have the following productions to describe an assignment statement (with integer rvalues):

assignStmt::= id:i LARROW intExpr:e SEMICOLON {: RESULT = new AssignmentStatement(i, e); :}
intExpr::= INTLITERAL:i {: RESULT = i.intVal; :}
id::= ID:i {: RESULT = i.strVal; :}

LARROW, ID, IF, INTLITERAL, and SEMICOLON are terminals, which are tokens returned by the lexer, and assignStmt, id, and intExpr are non-terminals. ID represents an identifier (e.g. class/variable/method name).

After failing the production for an if statement, we'll eventually enter the first production for an assignment statement. We expand the id non-terminal, whose only production is ID, but the token I want to match is IF, so the assignStmt production fails altogether.

For my language to allow a variable to be named "if" all I have to do is:

assignStmt::= id:i LARROW intExpr:e SEMICOLON {: RESULT = new AssignmentStatement(i, e); :}
intExpr::= INTLITERAL:i {: RESULT = i.intVal; :}
id::= ID:i {: RESULT = i.strVal; :}
     |IF {: RESULT = "if"; :}

Note that | defines an alternate production for the non-terminal. Now we have that second production for the id non-terminal, which matches the current token, and ultimately results in matching an assignment statement.
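For readers who don't use a parser generator, the same idea can be sketched as a hand-written recursive-descent parser. This is only an illustration under assumptions: the class KeywordAsIdParser, the Kind enum, and the tok helper are hypothetical names, not part of CUP or of the grammar above, and the parser returns a plain "name=value" string instead of building an AST node.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical hand-written parser; all names here are illustrative only.
class KeywordAsIdParser {
    enum Kind { IF, ID, LARROW, INTLITERAL, SEMICOLON }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    static Token tok(Kind kind, String text) { return new Token(kind, text); }

    private final List<Token> tokens;
    private int pos = 0;

    KeywordAsIdParser(List<Token> tokens) { this.tokens = tokens; }

    // id ::= ID | IF   (the second alternative lets "if" act as a name)
    String parseId() {
        Token t = next();
        if (t.kind == Kind.ID) return t.text;
        if (t.kind == Kind.IF) return "if";
        throw new IllegalStateException("expected identifier, got " + t.kind);
    }

    // assignStmt ::= id LARROW INTLITERAL SEMICOLON
    String parseAssign() {
        String name = parseId();
        expect(Kind.LARROW);
        int value = Integer.parseInt(expect(Kind.INTLITERAL).text);
        expect(Kind.SEMICOLON);
        return name + "=" + value;
    }

    private Token next() {
        if (pos >= tokens.size()) throw new IllegalStateException("unexpected end of input");
        return tokens.get(pos++);
    }

    private Token expect(Kind kind) {
        Token t = next();
        if (t.kind != kind) throw new IllegalStateException("expected " + kind + ", got " + t.kind);
        return t;
    }

    static String parse(Token... input) {
        return new KeywordAsIdParser(Arrays.asList(input)).parseAssign();
    }

    public static void main(String[] args) {
        // Token stream for: if <- 1;
        System.out.println(parse(tok(Kind.IF, "if"), tok(Kind.LARROW, "<-"),
                tok(Kind.INTLITERAL, "1"), tok(Kind.SEMICOLON, ";")));
    }
}
```

The lexer is untouched: it still emits IF for "if". Only parseId decides that, in a position where a name is expected, an IF token may stand in for an identifier.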

AssignmentStatement is an AST node defined as follows:

class AssignmentStatement {
     String varName;
     int intVal;
     AssignmentStatement(String s, int i){varName = s; intVal = i; }
}

Once the parser decides the source is syntactically correct, nothing else should be affected. The names of your variables shouldn't affect the later stages of compilation, provided you don't create conditions that would allow that to happen.

score 1 · Answer 2 · answered Apr 15 '12 at 23:57

Why on earth would you want to do it even if you could?

All it can do is make for unmaintainable code.

if (a==b) - is that an if expression or a call to the function if, passing a boolean arg?

I'd say if any language did let you do it, it would probably be some weird academic thing with 3 users.

[putting on asbestos underwear in preparation for merciless flaming from the 3 users ;-)]

score 1 · Answer 3 · answered Apr 16 '12 at 00:29

Programming languages tend to have reserved words because people like to put a lexical scanner in front of the parser. A lexical scanner turns the source code into a series of tokens, so you may end up with a ">>" token and declare all such tokens to be shift operators; then you cannot use those characters for anything else except as part of other tokens (like a quoted string). This is, or used to be, a well-known problem in C++, where the ">>" closing two nested template argument lists was lexed as a shift operator before C++11. Words like "if" work the same way: they are turned into some kind of "if" token, and whenever the parser sees that token it treats it as the first part of a conditional construct. Another example is JavaScript, where you can write

JSON.stringify({bar:2})

but, before ES5, you could not write

JSON.stringify({var:2})

Because "var" is a "var" token, but "bar" is just an identifier like any other.
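A minimal sketch of that scanner behaviour, assuming a toy language whose only keywords are if, while, and var. ReservedWordScanner and its token labels are hypothetical, and the scanner splits on whitespace only, which a real one would not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical toy scanner; the keyword set is an assumption for illustration.
class ReservedWordScanner {
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList("if", "while", "var"));

    // Labels each whitespace-separated word as a keyword token or an identifier.
    static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        for (String word : source.trim().split("\\s+")) {
            if (word.isEmpty()) continue; // empty input yields one empty string
            if (KEYWORDS.contains(word)) {
                // The decision is made here, before the parser sees any context.
                tokens.add(word.toUpperCase(Locale.ROOT));
            } else {
                tokens.add("ID(" + word + ")");
            }
        }
        return tokens;
    }
}
```

Because the keyword check happens per word with no knowledge of the surrounding syntax, "var" can never reach the parser as an identifier unless the grammar explicitly accepts the keyword token in identifier positions.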


As blackcompe's answer describes, if we allow `id -> IF`, then we can treat it as just an identifier. – Simon Guo Apr 16 '12 at 03:50
That postpones the same problem from the lexer to the parser, needing another layer there to fix it. Also keep in mind that you don't just have to accept valid code. You must also be able to output sane error messages in nearly the correct spot if the program is wrong. Ambiguous syntax makes that a lot harder. – Marco van de Voort Apr 16 '12 at 10:56
Starting with ES5, reserved words in JavaScript may be used for object property names without the need to quote them in brackets https://stackoverflow.com/a/40210179 `JSON.stringify({var: "abc", class: 123, let: 456, const: "def", import: 789})` is now valid. – Robin A. Meade Sep 01 '22 at 02:43

score 0 · Answer 4 · answered Apr 15 '12 at 23:56

Well, I cannot think of any compiled language without reserved keywords; it's simply much more convenient, and there are rarely good reasons to use those reserved keywords as names ('if' is not a good variable name).

In PHP, variable names start with a dollar sign, so I suppose a language could implement it that way (using a non-letter prefix for variables so you could have $if). I suppose that could be made to work, although again there is not much use in doing so.
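A rough sketch of the sigil idea, under the assumption of a toy language where every variable starts with '$' and the keywords are if and while. SigilScanner and its labels are hypothetical and are not PHP's actual lexer.

```java
import java.util.Locale;

// Hypothetical classifier for a sigil-based toy language (not PHP's real lexer).
class SigilScanner {
    static String classify(String word) {
        // A leading '$' marks a variable, whatever letters follow it.
        if (word.length() > 1 && word.charAt(0) == '$') {
            return "VAR(" + word.substring(1) + ")";
        }
        // Bare words are checked against the keyword list as usual.
        if (word.equals("if") || word.equals("while")) {
            return word.toUpperCase(Locale.ROOT);
        }
        return "NAME(" + word + ")";
    }
}
```

Since the sigil is examined first, "$if" and "if" land in different token classes, so the scanner can resolve the clash on its own without changes to the grammar.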

score 0 · Answer 5 · answered Apr 16 '12 at 08:36

One way to allow arbitrary keywords as names would be to use non-alphabetic symbols for all syntactic elements that are not identifiers. APL takes this approach, and arguably so does Smalltalk (in Smalltalk-80, there are six reserved words, but they all have variable-like semantics; things that would normally be keywords, like conditionals, are syntactically regular messages).

score -1 · Answer 6 · answered Apr 16 '12 at 00:02

I don't think any such language exists. All computer languages are based on a grammar, that is, a set of rules saying how the code must be constructed. That way, you can prove that a piece of code is structurally valid. If you were to allow switching names as you want, you would need a way to change the grammar "on the fly" to ensure that the verification of the code stays correct.

On a more practical level, why bother doing such a thing? What's so wrong with reserved keywords? They are really useful; at least everyone speaks the same language in the same way. You wouldn't even think of such a thing with a real-world language... Imagine if you started switching the meanings of words around! Nobody would understand anything anymore!!
