I've noticed that while this does not work:

class MyClass:
    def __init__(self, i):
        self.5i = i

myobj = MyClass(5)

print(myobj.5i)

This does work:

class MyClass:
    def __init__(self, i):
        setattr(self, '5i', i)

myobj = MyClass(5)

print(getattr(myobj, '5i'))

Even more, this also works:

class MyClass:
    def __init__(self, i):
        self.__dict__['i.j'] = i

myobj = MyClass(5)

print(myobj.__dict__['i.j'])

Are snippets 2 and 3 guaranteed to work? Or do we rather have nasal demons here whose evil plan to bring innocent souls to hell is to make shenanigans of this kind seem to work, while in fact, they don't?

The reason I'm asking is that I'm converting JSON objects (received from an API) to classes. The server likes returning additional, undocumented fields that sometimes do seem important, and I don't like losing data, so this is what most JSON-to-object conversions currently boil down to:

import json

class FooResponse:
    def __init__(self, json_string):
        # Copy every top-level JSON field into the instance namespace.
        self.__dict__.update(json.loads(json_string))

But, but... What if the server returns a JSON with some very weird fields, like, for example:

{
    "foo.bar": "Fizz-Buzz",
    "1337": "speek"
}

Or anything else of this sort? Do I have to try my darndest to sanitize this, or is a simple self.__dict__.update(json.loads(json_string)) appropriate?

The server documents accepting JSON calls with field names that contain dots or are string representations of numbers, so I find it possible that it might return such JSON as well.

  • Related, not necessarily a duplicate: https://stackoverflow.com/questions/26534634/attributes-which-arent-valid-python-identifiers – manveti Feb 13 '19 at 00:16
  • You should *always* sanitize input from untrusted sources. https://xkcd.com/327/ – torek Feb 13 '19 at 00:35
  • Well, that's where the judgment thing comes in: who do you trust? Is your API a "trusting" API, i.e., your user knows all and only gives you good inputs, or is yours a "suspicious of bad input" API, i.e., your user may be ignorant or malicious? Some APIs are mixed: you might have, for instance, methods `set_trusted` and `set_untrusted` (probably with better names). It's just that you mentioned JSON and JSON often—not always, but often—comes from untrusted sources. – torek Feb 13 '19 at 00:43

1 Answer

The __dict__ attribute is just a normal dict, so yes, it accepts any key a normal dict accepts, which means any hashable object. That does not make self.__dict__.update(json.loads(json_string)) the right thing to do, however. Not only would you be unable to reach abnormally named keys with the plain dot operator, but you would also invite unexpected behavior or even security risks: the JSON object, whose content is often externally sourced, can now shadow ordinary attributes of the object, including its methods and class-level defaults. You should stick to assigning json.loads(json_string) to a regular attribute of the object instead.
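To make the shadowing concern concrete, here is a minimal sketch comparing the two approaches. The class names `UnsafeResponse` and `SaferResponse`, the `summary` method, and the payload are all hypothetical, invented only for illustration.

```python
import json


class UnsafeResponse:
    """Hypothetical class: copies every JSON key into the instance namespace."""

    def __init__(self, json_string):
        self.__dict__.update(json.loads(json_string))

    def summary(self):
        return "a real method"


class SaferResponse:
    """Hypothetical class: keeps the parsed JSON in one regular attribute."""

    def __init__(self, json_string):
        self.data = json.loads(json_string)

    def summary(self):
        return "a real method"


# Made-up payload whose "summary" key collides with a method name.
payload = '{"summary": "oops", "foo.bar": "Fizz-Buzz", "1337": "speek"}'

unsafe = UnsafeResponse(payload)
# The instance dict entry shadows the method, so it is no longer callable.
shadowed = not callable(unsafe.summary)

safer = SaferResponse(payload)
# Odd keys stay reachable through the dict, and the method is untouched.
odd_value = safer.data["foo.bar"]
method_result = safer.summary()
```

In the unsafe variant the oddly named keys are still stored, but only `getattr(unsafe, "1337")` or `unsafe.__dict__` can reach them, while a colliding key silently replaces behavior callers rely on.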

blhsing
  • "You should stick to assigning json.loads(json_string) to a regular attribute of the object instead." - then why am I providing classes and models instead of simply returning (nested) dicts? –  Feb 13 '19 at 00:32
  • To elaborate: `foo.returnedAttributes['bar'].returnedAttributes['baz'].returnedAttributes['fizz']` is... daunting. My whole point was to enable access like `foo.bar.baz.fizz` when the API documents returning a JSON object whose fields include 'bar' and map to an object whose fields include 'baz' and map to an object whose fields include 'fizz'... –  Feb 13 '19 at 00:35
  • If that's the only purpose of the class, then yes, you should not be providing this class to begin with. Using a regular variable to store the JSON object would be enough. – blhsing Feb 13 '19 at 00:35
  • The class also provides numerous convenience methods. –  Feb 13 '19 at 00:36
  • No, you would only need `foo['bar']['baz']['fizz']` to access the nested dict value, which is not really that daunting. – blhsing Feb 13 '19 at 00:37
  • You can use a class to provide convenience methods, but still my point is that the dict storing the JSON object should be a regular attribute of the instance. – blhsing Feb 13 '19 at 00:38
  • Yes but then I can't say `foo['bar'].aggregateBazes()` because `foo['bar']` is now a dict and not an object. Not to mention other methods of this sort whose existence is the very purpose of the API I have to write. –  Feb 13 '19 at 00:38
  • Or should I inherit from dict then, to be able to provide both the dict-like access you say is better and the methods I need to provide? –  Feb 13 '19 at 00:42
  • I've already outlined the potential downsides of your approach in my answer to your question. Whether the convenience of turning the JSON dict into class attributes is worth the drawbacks is entirely up to you to decide based on your actual usage. – blhsing Feb 13 '19 at 00:42
  • If you want, you can override the `__getattribute__` method to allow access to the JSON dict attribute using the dot operator, which would be a much safer implementation. – blhsing Feb 13 '19 at 00:45
  • Take a look at [Object-like attribute access for nested dictionary](https://stackoverflow.com/questions/38034377/object-like-attribute-access-for-nested-dictionary), – martineau Feb 13 '19 at 01:46
  • @martineau IIUC you're pretty much doing the same: copying an arbitrary dict into `__dict__`. How does this improve anything wrt the problems stated in this answer? –  Feb 13 '19 at 03:51
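
The dot-access idea raised in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions: `JsonView` and its methods are hypothetical names, and it uses `__getattr__` instead of the `__getattribute__` mentioned above, because `__getattr__` runs only after normal attribute lookup has failed, so real methods always take precedence over JSON keys.

```python
import json


class JsonView:
    """Hypothetical wrapper: dot access falls back to a private dict."""

    def __init__(self, data):
        self._data = data

    @classmethod
    def from_json(cls, json_string):
        return cls(json.loads(json_string))

    def _wrap(self, value):
        # Wrap nested JSON objects so chained dot access keeps working.
        return type(self)(value) if isinstance(value, dict) else value

    def __getattr__(self, name):
        # Reached only when normal lookup fails. Refusing underscore
        # names avoids recursion if _data has not been set yet.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._wrap(self._data[name])
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        # Item access reaches every key, including non-identifiers.
        return self._wrap(self._data[key])

    def keys(self):
        return self._data.keys()


view = JsonView.from_json(
    '{"bar": {"baz": {"fizz": 1}}, "foo.bar": "Fizz-Buzz", "keys": "x"}'
)
nested = view.bar.baz.fizz        # dot access through nested objects
dotted_key = view["foo.bar"]      # key that is not a valid identifier
colliding_key = view["keys"]      # key that collides with a method name
```

With this layout a key such as `"keys"` cannot replace the `keys` method; it is simply unavailable through the dot operator and has to be read with item access.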