
I need to parse JSON requests that arrive at a single URL in several different formats. For example, some carry the timestamp in a timestamp attribute, others in unixtime, and so on. I want to create JSON schemas for all request types that not only validate the incoming JSON but also extract its parameters from the specified places. Is there a library that can do that?

Example:

If I could define a schema that would look something like this

schema = {
    "type" : "object",
    "properties" : {
        "price" : {
            "type" : "number",
            "mapped_name": "product_price"
        },
        "name" : {
            "type" : "string",
            "mapped_name": "product_name"

        },
        "added_at":{
            "type" : "int",
            "mapped_name": "timestamp"

        },
    },
}

and then apply it to a dict

request = {
    "name" : "Eggs",
    "price" : 34.99,
    'added_at': 1234567
}

by some magical function

params = validate_and_extract(request, schema)

I want params to have mapped values there:

{"product_name": "Eggs", "product_price": 34.99, "timestamp": 1234567}

That is the kind of module I'm looking for. It should also support nested dicts in the request, not just flat ones.

kurtgn
  • Can you give an example of input and desired output? – Avihoo Mamka Sep 12 '16 at 09:37
  • The standard [JSONDecoder](https://docs.python.org/2/library/json.html#encoders-and-decoders) of the json library does that. If you want to parse specific inputs such as dates [see this thread](http://stackoverflow.com/questions/8793448/how-to-convert-to-a-python-datetime-object-with-json-loads). – Emilien Sep 12 '16 at 14:51
  • Not aware of any such library in Python. Have you tried writing your own function to do this? Community users will be reluctant to help if you haven't tried it yourself first. – coder.in.me Sep 12 '16 at 15:10

2 Answers

The following code may help. It supports nested dicts as well.



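def valid_type(type_name, obj):
    # bool is a subclass of int, so reject it up front;
    # none of the supported types should accept booleans.
    if isinstance(obj, bool):
        return False

    if type_name == "number":
        return isinstance(obj, (int, float))

    if type_name == "int":
        return isinstance(obj, int)

    if type_name == "float":
        return isinstance(obj, float)

    if type_name == "string":
        return isinstance(obj, str)

    # Unknown type names never validate.
    return False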


def validate_and_extract(request, schema):
    ''' Validate request (dict) against the schema (dict).

        Validation is limited to naming and type information.
        No check is done to ensure all elements in schema
        are present in the request. This could be enhanced by
        specifying mandatory/optional/conditional information
        within the schema and subsequently checking for that.
    '''
    out = {}

    for k, v in request.items():
        if k not in schema['properties'].keys():
            print("Key '{}' not in schema ... skipping.".format(k))
            continue

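        if schema['properties'][k]['type'] == 'object':
            # A nested schema only makes sense for a nested dict.
            if not isinstance(v, dict):
                print("Wrong type for '{}' ... skipping.".format(k))
                continue
            v = validate_and_extract(v, schema['properties'][k])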

        elif not valid_type(schema['properties'][k]['type'], v):
            print("Wrong type for '{}' ... skipping.".format(k))
            continue

        out[schema['properties'][k]['mapped_name']] = v

    return out


# Sample Data 1
schema1 = {
    "type" : "object",
    "properties" : {
        "price" : {
            "type" : "number",
            "mapped_name": "product_price"
        },
        "name" : {
            "type" : "string",
            "mapped_name": "product_name"

        },
        "added_at":{
            "type" : "int",
            "mapped_name": "timestamp"

        },
    },
}
request1 = {
    "name" : "Eggs",
    "price" : 34.99,
    'added_at': 1234567
}

# Sample Data 2: containing nested dict
schema2 = {
    "type" : "object",
    "properties" : {
        "price" : {
            "type" : "number",
            "mapped_name": "product_price"
        },
        "name" : {
            "type" : "string",
            "mapped_name": "product_name"
        },
        "added_at":{
            "type" : "int",
            "mapped_name": "timestamp"
        },
        "discount":{
            "type" : "object",
            "mapped_name": "offer",
            "properties" : {
                "percent": {
                    "type" : "int",
                    "mapped_name": "percentage"
                },
                "last_date": {
                    "type" : "string",
                    "mapped_name": "end_date"
                },
            }
        },
    },
}
request2 = {
    "name" : "Eggs",
    "price" : 34.99,
    'added_at': 1234567,
    'discount' : {
        'percent' : 40,
        'last_date' : '2016-09-25'
    }
}


params = validate_and_extract(request1, schema1)
print(params)

params = validate_and_extract(request2, schema2)
print(params)

Output from running this (key order may differ depending on the Python version):

{'timestamp': 1234567, 'product_name': 'Eggs', 'product_price': 34.99}
{'offer': {'percentage': 40, 'end_date': '2016-09-25'}, 'timestamp': 1234567, 'product_name': 'Eggs', 'product_price': 34.99}
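
The docstring above notes that mandatory keys are not checked, and the question involves several request formats arriving at one URL. The following is only a minimal sketch of how both could be handled, not a definitive implementation. It assumes a hypothetical `required` list in each schema (not part of the schema format above), and the names `extract_strict`, `extract_first_match`, `schema_a` and `schema_b` are made up for illustration. It raises `ValueError` instead of printing, so a caller can try the schemas one after another.

```python
def _type_ok(type_name, value):
    """Check a value against the type names used in this thread."""
    if isinstance(value, bool):  # bool is a subclass of int; reject it
        return False
    allowed = {
        "number": (int, float),
        "int": int,
        "float": float,
        "string": str,
    }
    return type_name in allowed and isinstance(value, allowed[type_name])


def extract_strict(request, schema):
    """Return the mapped dict, or raise ValueError if the request does not fit.

    'required' is a hypothetical list of key names (an assumption of this
    sketch), checked at every nesting level where it is present.
    """
    if not isinstance(request, dict):
        raise ValueError("expected an object")
    props = schema.get("properties", {})
    missing = [k for k in schema.get("required", []) if k not in request]
    if missing:
        raise ValueError("missing keys: {}".format(missing))
    out = {}
    for key, value in request.items():
        if key not in props:
            continue  # unknown keys are ignored, as in the code above
        spec = props[key]
        if spec["type"] == "object":
            value = extract_strict(value, spec)
        elif not _type_ok(spec["type"], value):
            raise ValueError("wrong type for '{}'".format(key))
        out[spec["mapped_name"]] = value
    return out


def extract_first_match(request, schemas):
    """Try each schema in order and return the first successful mapping."""
    for schema in schemas:
        try:
            return extract_strict(request, schema)
        except ValueError:
            continue
    raise ValueError("request matches none of the schemas")


# Two hypothetical request formats that differ only in the timestamp key.
schema_a = {
    "type": "object",
    "required": ["name", "timestamp"],
    "properties": {
        "name": {"type": "string", "mapped_name": "product_name"},
        "timestamp": {"type": "int", "mapped_name": "timestamp"},
    },
}
schema_b = {
    "type": "object",
    "required": ["name", "unixtime"],
    "properties": {
        "name": {"type": "string", "mapped_name": "product_name"},
        "unixtime": {"type": "int", "mapped_name": "timestamp"},
    },
}

params = extract_first_match({"name": "Eggs", "unixtime": 1234567},
                             [schema_a, schema_b])
print(params)
```

Here the request fails `schema_a` because `timestamp` is missing, then matches `schema_b`, so `unixtime` ends up under the common `timestamp` name. Schema order matters: the first schema that fits wins, so more specific schemas should probably come first.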
coder.in.me

See http://json-schema.org

This doesn't look like a Python question.

Laurent LAPORTE