
`setattr` will set attribute names that cannot be used with regular attribute access, i.e. `obj.name`.

>>> from types import SimpleNamespace
>>> my_instance = SimpleNamespace()
>>> setattr(my_instance, 'from', 0)  # works
>>> getattr(my_instance, 'from')
0
>>> my_instance.from
SyntaxError: invalid syntax

How can I check for such names, to avoid using them?

wim
Shay
  • 4
    The attribute name is not really illegal. It is only that `from` is a keyword. – Willem Van Onsem Jun 21 '17 at 18:34
  • 2
    You can't do that; they're invalid as identifiers, but not as strings. Probably the best thing to do is check against a [list of keywords](https://stackoverflow.com/questions/9642087/is-it-possible-to-get-a-list-of-keywords-in-python) and stop them being set in the first place. But maybe your approach is not a good one to start with? – jonrsharpe Jun 21 '17 at 18:34
  • @WillemVanOnsem Yes. But I can see what he means. You can also set names like `.` or `&` which are nonsensical and illegal. The OP should probably just do what Jonrsharpe suggested. – Christian Dean Jun 21 '17 at 18:36
  • 3
    Don't mess with the `__dict__` directly, use `setattr` and `getattr` and cut the gordian knot. – juanpa.arrivillaga Jun 21 '17 at 18:40
  • 1
    BTW, it seems you really just want `types.SimpleNamespace` – juanpa.arrivillaga Jun 21 '17 at 18:41
  • 1
    If invalid attribute names are "sneaking" in, then they shouldn't be attribute names in the first place - they are *data*, and they should be keys or values in a dict proper. I think this is an XY problem. – wim Jun 21 '17 at 19:01

1 Answer


On Python 3,

import keyword

type(key) is str and key.isidentifier() and not keyword.iskeyword(key)

We check that the attribute name is a string, that it has the form of a Python identifier, and that it is not a keyword. `str.isidentifier` doesn't exclude keywords, so the extra `keyword.iskeyword` check is necessary.

(Yes, isinstance is a thing, but I don't really want to allow str subclasses.)
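As a rough sketch, the expression above can be wrapped in a small helper and tried on a few inputs. The function name `is_dot_accessible` is made up for illustration and is not part of any library; the body is the same three-part check.

```python
import keyword


def is_dot_accessible(name):
    """Return True if `name` could be written as `obj.<name>` in source code.

    Hypothetical helper; it only wraps the check from the answer.
    """
    return (
        type(name) is str                # exact str only, no subclasses
        and name.isidentifier()          # shaped like a Python identifier
        and not keyword.iskeyword(name)  # but not a reserved word
    )


print(is_dot_accessible("name"))  # True
print(is_dot_accessible("from"))  # False: a keyword
print(is_dot_accessible("a.b"))   # False: not an identifier
print(is_dot_accessible(0))       # False: not a str
```

Names rejected by this helper can still be set and read with `setattr` and `getattr`; the check only tells you whether dot notation would work.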

user2357112
  • 2
    Confused why you attracted a down-vote here. I was just about to post a link to [the `keyword` module](https://docs.python.org/3/library/keyword.html) myself; that's exactly what it's there for. I mean, yes, you should try to avoid invalid strings as keyword/attribute names in general, but if you don't control your input enough, this mix of whitelisting (must be legal identifier `str`) and blacklisting (but not a `keyword`) is a decent option. My up-vote cancels it out I guess. – ShadowRanger Jun 21 '17 at 18:38
  • 2
    Aye, that was me. The code has false positives, i.e. it identifies strings which aren't necessarily invalid attribute names, and this answer fails to address details about that. Here's [Guido's take](https://mail.python.org/pipermail/python-dev/2012-March/117441.html) on the matter. – wim Jun 21 '17 at 18:40
  • @wim: They're only false positives if your goal is to allow attribute names like `from`. If your goal is to only allow names that could be used with dot notation, as in the question, they're not false positives. You could just as easily argue that objects like `0` are false positives, and that any hashable should be accepted because any hashable can be used as a `__dict__` key. – user2357112 Jun 21 '17 at 18:43
  • @wim: Ah, okay. I'm generally not a fan of allowing attributes that can't be used via `obj.name` syntax since working with them is such a pain (must use `getattr`/`setattr` or `operator.attrgetter`, which is both much slower and uglier); the only time I've ever considered it is when I need to interoperate with JSON using sigils or the like, but that is a valid use case on occasion. That said, I'd argue that your argument makes more sense as a response to the OP (who is trying to exclude valid names), not to this answer (which is doing what the OP asked). – ShadowRanger Jun 21 '17 at 18:44
  • No, I would not argue that. I would argue that anything which getattr/setattr accepts is a valid attribute name. `__dict__.update` should NOT be used. The question is lacking finesse, but that doesn't mean the answer should also lack finesse. – wim Jun 21 '17 at 18:44
  • 1
    @wim: And I would point to the example of `namedtuple`, which [also performs](https://github.com/python/cpython/blob/78d9e58f204ec4e90502b42c3e7d48dcd76ccb80/Lib/collections/__init__.py#L391) an `isidentifier` and `iskeyword` check on the field names you give it. Excluding such names is an entirely valid design choice. – user2357112 Jun 21 '17 at 18:48
  • Giving `namedtuple` as an example of good design is amusing. I would stop short and just say it is "a design choice", since EAFP is usually preferred in Python. Either way, this answer should be expanded to clarify that instance dicts should not be messed with directly, and at the very least mention that arbitrary strings as attributes is a *feature* of Python. – wim Jun 21 '17 at 19:04
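
For reference, the `namedtuple` behaviour mentioned in the comments can be observed with a short sketch. The type name `Row` and the field names are invented for illustration; the `rename=True` option is the documented way to have `namedtuple` replace rejected names with positional ones.

```python
from collections import namedtuple

# A keyword as a field name is rejected with ValueError.
try:
    namedtuple("Row", ["id", "from"])
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False

# With rename=True, the invalid name is replaced by its positional name.
Row = namedtuple("Row", ["id", "from"], rename=True)
print(Row._fields)  # ('id', '_1')
```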