According to the documentation of the keyword module, two new members have been added in Python 3.9:

  • issoftkeyword
  • softkwlist

However, their documentation doesn't reveal anything about their purpose. The change is not even mentioned in the What's New article, where API changes are typically documented. Digging a little further into the source code eventually leads to this pull request, where it is mentioned that "this is essentially an internal tool" and that "soft keywords are still unused". So what's the purpose of Python's soft keywords?

ruohola
a_guest

3 Answers

Short: Soft keywords can still be used as variable or argument names.

PEP 622 sheds some light (Emphasis mine):

The difference between hard and soft keywords is that hard keywords are always reserved words, even in positions where they make no sense (e.g. x = class + 1), while soft keywords only get a special meaning in context.

[...] The match and case keywords are proposed to be soft keywords, so that they are recognized as keywords at the beginning of a match statement or case block respectively, but are allowed to be used in other places as variable or argument names.

couka
  • Or even [PEP 634](https://www.python.org/dev/peps/pep-0634/) which superseded 622 – khelwood Jan 19 '21 at 22:23
  • PEP 634 contains the same example (`match` and `case`) but does not provide a general explanation of what a soft keyword is. PEP 622 does. – couka Jan 19 '21 at 22:34
  • Didn't `True`, `False`, `as`, and `None` use to be soft keywords? – gerrit Jan 20 '21 at 16:02
  • In Python 2, `True` and `False` weren't keywords at all: they were *just* identifiers in the built-in scope. Neither `as` nor `None` have ever been valid identifier names. (At least, I *think* `as` has been a keyword from the beginning, as part of the `import` statement. If it was ever a valid identifier, it was sometime prior to Python 1.5.) – chepner Jan 20 '21 at 16:08
  • Soft keywords are, IIUC, a side effect of the PEG parser being a backtracking parser. A token can be an identifier in one parsing attempt, but a keyword in another. PEP 622 provides the example `match [x,y]:`. `match` would first be parsed as an identifier, until the `:` is encountered, at which point the parse would fail. It then backtracks and tries the alternative of parsing `match` as a keyword. – chepner Jan 20 '21 at 16:13
  • @chepner I seem to recall getting a `SyntaxWarning` for assigning to `as`, but that could have been way back with Python 1.5. I'm quite sure assigning to `None` was also possible at the time. – gerrit Jan 20 '21 at 16:16
  • @chepner See [this issue from 2003](https://bugs.python.org/issue691733), which was actually by myself — assigning to `None` was definitely a `SyntaxWarning` in 2003, and assigning to `as` was too up to and including Python 2.5. – gerrit Jan 20 '21 at 16:19
  • That said, just because you *can* doesn't mean you *should*. Using something that *could* be a keyword but isn't, based on context, is likely to be confusing to whoever has to look at the code later. I would generally call it bad practice to do so. (Not sure about this case, but in other languages, this is often allowed when new keywords are added, in order to prevent breaking legacy code, but it's still a bad idea to use them if you can avoid it.) – Darrel Hoffman Jan 20 '21 at 19:20
  • @gerrit: `True`/`False`/`None` could never be soft keywords in the sense used here; they're expression-level, so they're legal in almost all contexts; I suppose you could allow them as attribute names (`.` context removes ambiguity), but it's fairly limited. It's only stuff like `async`, `class`, `def`, `as`, `with`, etc. that are all at least statement level, occurring in specific locations in the statement, where soft-ness would meaningfully limit keywordiness (none of them *are*, but some of them could theoretically be made soft with the new PEG parser to enable more expressive grammar). – ShadowRanger Jan 21 '21 at 00:24
  • also called [contextual keywords](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/contextual-keywords) in C++ and C#, and [Context-sensitive keywords](https://learn.microsoft.com/en-us/cpp/extensions/context-sensitive-keywords-cpp-component-extensions?view=msvc-160) in C++/CLI – phuclv Jan 21 '21 at 02:10

I think this is best explained with a demo. async and await were soft keywords in Python 3.5 and 3.6, and thus they could be used as identifiers:

>>> async = "spam"
>>> async def foo():
...     pass
...
>>> await = "bar"
>>> async, await
('spam', 'bar')

But in Python 3.7 they became proper keywords and can only be used in specific contexts where they make sense:

>>> async = "123"
  File "<stdin>", line 1
    async = "123"
          ^
SyntaxError: invalid syntax
>>> async def foo():
...     pass
...
>>> await = "bar"
  File "<stdin>", line 1
    await = "bar"
          ^
SyntaxError: invalid syntax
>>> async, await
  File "<stdin>", line 1
    async, await
         ^
SyntaxError: invalid syntax

The idea behind first introducing them as soft keywords was mainly to avoid breaking existing code that uses them as identifiers. The same reasoning applies to the upcoming match keyword: making it a hard keyword would break, for example, re.match and with it a huge number of existing projects.

ruohola
  • Indeed I forgot about `async` and `await` being soft keywords back then (though they were not included in the `keyword` module). I got the impression that soft keywords are made possible by the new PEG parser, so do you know how they realized `async` and `await` back in Python 3.6? – a_guest Jan 21 '21 at 14:00
  • They were implemented with [a tokenizer (lexer) hack](https://benjam.info/blog/posts/2019-09-18-python-deep-dive-tokenizer/#async-and-await). The tokenizer does the lookahead and maintains context about if you're inside a function, instead of the parser, and returns a special token for async and await. That solution didn't work for async comprehensions defined outside of async functions (invalid in 3.6, but valid in 3.7), and wasn't a scalable solution for future soft keywords. – bgw Mar 02 '21 at 05:01

Soft keywords are keywords that are context-sensitive. For example, if class were a soft keyword, you could use it as a variable name wherever it can't be interpreted as defining a class. That would allow us to replace cls with class, for example.

Today that's not possible, since class is a keyword:

>>> def a(class):
  File "<stdin>", line 1
    def a(class):
          ^
SyntaxError: invalid syntax

Given the context it's clear that the user didn't intend to define a new class, but wanted an identifier named class.
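The contrast between a hard keyword and a soft one in parameter position can be checked without a REPL. This is a minimal sketch; `compiles` is a hypothetical helper written for this example, built on the built-in `compile()`.

```python
def compiles(source):
    """Return True if the source parses, False on a SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


# "class" is a hard keyword, so it is rejected even as a parameter name.
hard_as_param = compiles("def a(class): pass")

# "match" is not a hard keyword, so the same position accepts it.
soft_as_param = compiles("def a(match): pass")

print(hard_as_param)  # False
print(soft_as_param)  # True
```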

MatsLindh
  • More importantly, it works in the other direction, too: it lets you *add* a keyword to the language without invalidating its use as a variable name in existing code. One reason why the assignment operator `:=` exists is because no suitable re-use of an existing keyword (`as`) could be agreed on. My personal preference would have been `let x = 3 in x * 2` as opposed to `(x:=3) + 2`, but an extremely high bar is set for adding keywords because of the risk of breaking existing code. – chepner Jan 20 '21 at 14:50
  • (To be clear, I'm not sure such a `let` expression was explicitly considered, but the new-keyword aspect would have been at least as big an issue as adding yet another meaning to the `in` keyword.) – chepner Jan 20 '21 at 14:51