12

How do I define the schema in colander for JSON of the following form?

{
    'data' : {
        'key_1' : [123, 567],
        'key_2' : ['abc','def'],
        'frank_underwood' : [666.66, 333.333],
        ... etc ...
    }
}

The keys inside 'data' could be any string and values are arrays.

Currently I have the following, but it puts no constraints on the types of the values in the mapping.

class Query(colander.MappingSchema):
    data = colander.SchemaNode(
        colander.Mapping(unknown='preserve'),
        missing={}
    )

What's the proper way of describing this?

XiaoChuan Yu

3 Answers

6

A possible solution is to use a custom validator.

Here is a full working example of a custom validator that checks that every value in an arbitrary mapping is a list whose elements all share a single type.

import colander


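def values_are_singularly_typed_arrays(node, mapping):
    """Raise colander.Invalid unless every value is a list of one element type."""
    for val in mapping.values():
        if not isinstance(val, list):
            raise colander.Invalid(node, "one or more value(s) is not a list")
        # More than one distinct element type means the list is mixed.
        # An empty list has no element types and is accepted.
        if len(set(map(type, val))) > 1:
            raise colander.Invalid(node, "one or more value(s) is a list with mixed types")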

class MySchema(colander.MappingSchema):
    data = colander.SchemaNode(
        colander.Mapping(unknown='preserve'),
        validator=values_are_singularly_typed_arrays
    )

def main():
    valid_data = {
        'data' : {
            'numbers' : [1,2,3],
            'reals' : [1.2,3.4,5.6],
        }
    }
    not_list = {
        'data' : {
            'numbers' : [1,2,3],
            'error_here' : 123
        }
    }
    mixed_type = {
        'data' : {
            'numbers' : [1,2,3],
            'error_here' : [123, 'for the watch']
        }
    }

    schema = MySchema()
    schema.deserialize(valid_data)

    try:
        schema.deserialize(not_list)
    except colander.Invalid as e:
        print(e.asdict())

    try:
        schema.deserialize(mixed_type)
    except colander.Invalid as e:
        print(e.asdict())

if __name__ == '__main__':
    main()
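The validator above stops at the first bad value and does not say which key caused it. As a rough sketch of the same check that reports every offending key, the helper below uses only plain Python. The name `find_invalid_values` and the decision to accept empty lists are assumptions made for illustration, not part of colander.

```python
def find_invalid_values(mapping):
    """Return {key: reason} for every value that is not a uniformly typed list."""
    problems = {}
    for key, val in mapping.items():
        if not isinstance(val, list):
            problems[key] = "not a list"
        # More than one distinct element type means the list is mixed.
        # Assumption: an empty list counts as valid.
        elif len(set(map(type, val))) > 1:
            problems[key] = "list with mixed types"
    return problems


if __name__ == '__main__':
    sample = {
        'numbers': [1, 2, 3],
        'not_list': 123,
        'mixed': [123, 'for the watch'],
    }
    print(find_invalid_values(sample))
```

Inside a colander validator, a non-empty result could be passed as the message of a single `colander.Invalid(node, ...)`, so the caller sees all failing keys at once.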
XiaoChuan Yu
1

I don't know about colander but you could use Spyne.

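from spyne.model.complex import ComplexModel, Array
from spyne.model.primitive import Integer, Unicode, Double

# Static schema: every key inside 'data' must be known ahead of time.
class Data(ComplexModel):
    key_1 = Array(Integer)
    key_2 = Array(Unicode)
    frank_underwood = Array(Double)

class Wrapper(ComplexModel):
    data = Data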

Full working example: https://gist.github.com/plq/3081280856ed1c0515de

Spyne's model docs: http://spyne.io/docs/2.10/manual/03_types.html


However, it turns out that's not what you need. If you want a more loosely specified dictionary, you need to resort to a custom type:

class DictOfUniformArray(AnyDict):
    @staticmethod  # yes staticmethod
    def validate_native(cls, inst):
        for k, v in inst.items():
            if not isinstance(k, six.string_types):
                raise ValidationError(type(k), "Invalid key type %r")
            if not isinstance(v, list):
                raise ValidationError(type(v), "Invalid value type %r")
            # log_repr prevents too much data going in the logs.
            if not len(set(map(type, v))) == 1:
                raise ValidationError(log_repr(v),
                                      "List %s is not uniform")
        return True

class Wrapper(ComplexModel):
    data = DictOfUniformArray

Full working example: https://github.com/arskom/spyne/blob/spyne-2.12.5-beta/examples/custom_type.py

Burak Arslan
  • Does this allow the keys to change inside "data"? It seems this solution assumes you know ahead of time "key_1", "key_2" etc are keys inside "data". – XiaoChuan Yu Jul 19 '15 at 17:10
  • Yes, as that's an object definition. How'd you otherwise know which type goes where? – Burak Arslan Jul 19 '15 at 21:45
  • The problem described in my question is that you do NOT know the complete schema of the object at runtime. The only constraint is the keys inside 'data' could be any string and the values are arrays. A simple static schema which you outlined here will not suffice. – XiaoChuan Yu Jul 20 '15 at 03:51
  • Ah. So you need something along the lines of `Dict(Unicode, Array(Any))` right? That's not supported by Spyne yet, but it's in the to-do list. – Burak Arslan Jul 20 '15 at 13:32
  • Yea `Dict(Unicode, Array(Any))` would have been perfect. It seems this feature is missing in both Colander as well as in Spyne right now (unless I missed something of course). I took a quick look at Spyne and it looks like a very good alternative to Colander. Thanks for the answer! – XiaoChuan Yu Jul 20 '15 at 18:35
1

Here's another way I found.

from colander import SchemaType, Invalid, null
import json

class JsonType(SchemaType):
    ...
    def deserialize(self, node, cstruct):
        # Treat missing or empty input as colander.null.
        if not cstruct:
            return null
        try:
            # json.loads no longer accepts an `encoding` argument on current Python 3.
            result = json.loads(cstruct)
        except (TypeError, ValueError):
            raise Invalid(node, "Not json ...")
        return result
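The failure modes of the `json.loads` call can be tried out without colander. The sketch below is standard library only, and `parse_json_object` is a hypothetical helper name. It turns both kinds of decoding failure into one `ValueError` and also rejects JSON whose top level is not an object, which is an extra check not present in the snippet above.

```python
import json


def parse_json_object(text):
    """Decode text as JSON and return it if the top level is an object."""
    try:
        result = json.loads(text)
    except (TypeError, ValueError) as exc:
        # json.JSONDecodeError subclasses ValueError; non-string input raises TypeError.
        raise ValueError("not valid JSON: %s" % exc)
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object at the top level")
    return result


if __name__ == '__main__':
    print(parse_json_object('{"data": {"key_1": [123, 567]}}'))
```

In a `SchemaType.deserialize` method, the `ValueError` would be caught and re-raised as `colander.Invalid(node, ...)`.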

Related question: How to create a "type-agnostic" SchemaNode in colander

Community
slashdottir