5

How may we validate multiple refs in a schema using jsonschema.RefResolver?

I have a validation script that works well when a schema file contains a single ref. I now have a schema with two or three refs that point to files in a different directory.

import json
import os
import jsonschema

base_dir = '/schema/models/'
with open(os.path.join(base_dir, 'Defined.json')) as file_object:
    schema = json.load(file_object)
    resolver = jsonschema.RefResolver('file://' + base_dir + '/' + 'Fields/Ranges.json', schema)
    jsonschema.Draft4Validator(schema, resolver=resolver).validate(data)

My JSON schema:

{
  "properties": {
    "description": {
      "type": "object",
      "after": {"type": ["string", "null"]},
      "before": {"type": "string"}
    },
    "width": {"type": "number"},
    "range_specifier": {"type": "string"},
    "start": {"type": "number", "enum": [0, 1]},
    "ranges": {
      "$ref": "Fields/Ranges.json"
    },
    "values": {
      "$ref": "Fields/Values.json"
    }
  }
}

So my question is: should I have two resolvers, one for ranges and one for values, and call them separately in Draft4Validator? Or is there a better way to do this?

repop_rev

4 Answers

9

I spent several hours on the same issue myself, so I hope this workaround is useful for others.

import json
import os

from jsonschema import Draft4Validator, RefResolver
from jsonschema.exceptions import SchemaError, ValidationError


def validate(schema_search_path, json_data, schema_id):
    """
    Load every schema in schema_search_path and validate json_data
    against the schema identified by schema_id.
    """
    try:
        # Map each schema's "id" to its parsed content.
        schemastore = {}
        for fname in os.listdir(schema_search_path):
            fpath = os.path.join(schema_search_path, fname)
            if fpath.endswith(".json"):
                with open(fpath, "r") as schema_fd:
                    schema = json.load(schema_fd)
                    if "id" in schema:
                        schemastore[schema["id"]] = schema

        schema = schemastore.get("http://mydomain/json-schema/%s" % schema_id)
        # check_schema needs the schema it is supposed to check.
        Draft4Validator.check_schema(schema)
        resolver = RefResolver("file://%s.json" % os.path.join(schema_search_path, schema_id), schema, schemastore)
        Draft4Validator(schema, resolver=resolver).validate(json_data)
        return True
    except ValidationError as error:
        # handle validation error
        pass
    except SchemaError as error:
        # handle schema error
        pass
    return False

Every JSON schema that should take part in reference resolution needs an "id" element. The last segment of the root schema's id (myid in the example below) is what must be passed to validate as the schema_id argument:

  "id": "http://mydomain/json-schema/myid"

All the schemas are loaded into a dict, which is then passed to the resolver as its store. In your example you should also load the schemas from the other directory.
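A note on the relative "$ref" values in the question: as far as I can tell, RefResolver resolves a relative reference against its base URI with ordinary URL-joining rules, so the base URI decides where "Fields/Ranges.json" and "Fields/Values.json" end up pointing. The sketch below uses only urllib.parse.urljoin from the standard library to illustrate that behaviour. It does not call jsonschema itself, and the two base URIs are assumptions derived from the paths in the question.

```python
from urllib.parse import urljoin

# Base URI pointing at the directory that holds Defined.json (note the trailing slash).
directory_base = "file:///schema/models/"
# Base URI pointing at one of the referenced files, similar to the question's code.
file_base = "file:///schema/models/Fields/Ranges.json"

refs = ["Fields/Ranges.json", "Fields/Values.json"]

# Each relative ref is joined with the base URI to obtain an absolute URI.
resolved_from_directory = [urljoin(directory_base, ref) for ref in refs]
resolved_from_file = [urljoin(file_base, ref) for ref in refs]

for url in resolved_from_directory + resolved_from_file:
    print(url)
```

With the directory as base URI both references land on the intended files, while the file-based URI produces a doubled "Fields/Fields/" path. That suggests a single resolver is enough, provided its base URI (or its store) is set up correctly.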

s1m0
  • This is the best answer I've seen so far. I went a bit further by avoiding initializing the two first parameters of the resolver, which require the id to be a url: resolver = jsonschema.RefResolver("", "", schemastore) works well. It protects from having KeyError exceptions when the code tries to de-reference the other schemas in the schemastore. – Laurent Le Meur May 03 '20 at 11:02
2

I'm storing my entire schema in a YAML file named api-spec.yaml in the root of a Python package. The YAML file conforms to OpenAPI 3.0 (commonly called Swagger 3.0). It should be straightforward to adapt this example to validate any object described in the root schema.

Note: this example requires you to execute the following from the command line:

pip install pyyaml jsonschema

In a package named cn the YAML file is loaded in __init__.py:

import yaml, os
__all__ = ['api_swagger']
with open(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'api-spec.yaml')) as file:
    api_swagger = yaml.load(file, Loader=yaml.SafeLoader)

And here's some example code in a Python unit test:

import unittest
import jsonschema
from cn import api_swagger


class ValidateSwaggerSchema_Tests(unittest.TestCase):

        def test_validation_using_jsonschema_RefResolver___should_pass(self):
            # setup
            from cn import api_swagger
            schemastore = {
                '': api_swagger,
            }
            resolver = jsonschema.RefResolver(
                base_uri='',
                referrer=api_swagger,
                store=schemastore)

            # setup: the schemas used when validating the data
            ParsedMessageSchema = api_swagger['components']['schemas']['ParsedMessage']
            FillSchema = api_swagger['components']['schemas']['Fill']

            # validate: ParsedMessageSchema
            validator = jsonschema.Draft7Validator(ParsedMessageSchema, resolver=resolver)
            validator.validate({
                'type': 'fill',
                'data': [{
                    'order_id': 'a_unique_order_id',
                    'exchange_id': 'bittrex',
                    'market_id': 'USD-BTC',
                    'side': 'sell',
                    'price': 11167.01199693,
                    'amount': 0.00089773,
                    'type': 'limit',
                    'status': 'filled',
                    'leaves_amount': 0.0,
                    'cumulative_amount': 0.00089773,
                    'timestamp': '2019-07-13T20:17:01.480000',
                }]
            })

            # validate: FillSchema
            validator = jsonschema.Draft7Validator(FillSchema, resolver=resolver)
            validator.validate(
                {
                    'order_id': 'a_unique_order_id',
                    'exchange_id': 'bittrex',
                    'market_id': 'USD-BTC',
                    'side': 'sell',
                    'price': 11167.01199693,
                    'amount': 0.00089773,
                    'type': 'limit',
                    'status': 'filled',
                    'leaves_amount': 0.0,
                    'cumulative_amount': 0.00089773,
                    'timestamp': '2019-07-13T20:17:01.480000',
                }
            )
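
References inside a spec like this typically look like "#/components/schemas/Fill": an empty URL part followed by a JSON pointer. Roughly speaking, the resolver looks up the URL part in the store (hence the '' key above) and then walks the pointer through the document. The function below, resolve_local_ref, is a hypothetical helper written only to illustrate that lookup with the standard library. It is not part of jsonschema, and the tiny spec dict is made-up sample data.

```python
from urllib.parse import unquote, urldefrag


def resolve_local_ref(store, ref):
    """Hypothetical helper: look up a "$ref" such as "#/a/b" in a schema store."""
    # Split "url#fragment"; for a purely local ref the url part is "".
    url, fragment = urldefrag(ref)
    document = store[url]
    if not fragment:
        return document
    # Walk the JSON pointer one segment at a time.
    for part in unquote(fragment).lstrip("/").split("/"):
        # Undo the JSON pointer escapes for "/" and "~".
        part = part.replace("~1", "/").replace("~0", "~")
        if isinstance(document, list):
            part = int(part)
        document = document[part]
    return document


# Made-up sample data standing in for the loaded api-spec.yaml.
spec = {"components": {"schemas": {"Fill": {"type": "object"}}}}
store = {"": spec}

fill_schema = resolve_local_ref(store, "#/components/schemas/Fill")
print(fill_schema)
```
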
MikeyE
0

You can use the Python library "pyjn" (pip install pyjn) and do this in three lines.

from pyjn import pyjn
pyjn=pyjn()
json_pathtest='C:/Users/Entity.json'
print(pyjn.refsolver(json_pathtest))

0

I used the full URI in every schema's "$id" and every "$ref" and did not have to bother with relative paths when using a RefResolver.

Schema A, found in <repo-root>/my-schemas/A.schema.json:

{
    "$id": "https://mycompany.com/my-schemas/A.schema.json",
    ...
}

Schema B, found in <repo-root>/my-schemas/one-more-level/B.schema.json:

{
    "$id": "https://mycompany.com/my-schemas/one-more-level/B.schema.json",
    ...
    {
        "$ref": "https://mycompany.com/my-schemas/A.schema.json"
    }
    ...
}

Then, I load all my schemas into a store with their full ID as the key:

from json import load
from os import path, walk
from typing import Dict

# Map each schema's full "$id" to its parsed content.
schema_store: Dict[str, dict] = {}
for path_, _, files in walk("<repo-root>/my-schemas/"):
    for file in files:
        if file.endswith(".schema.json"):
            absfile = path.join(path_, file)
            with open(absfile) as f:
                schema = load(f)
                schema_store[schema["$id"]] = schema

Then, validate instances against, e.g., B with a RefResolver (for which documentation is scarce):

from jsonschema import Draft202012Validator, RefResolver

schema = schema_store["https://mycompany.com/my-schemas/one-more-level/B.schema.json"]
validator = Draft202012Validator(
    schema=schema,
    resolver=RefResolver(base_uri="", referrer=schema, store=schema_store),  # type: ignore
)
validator.validate(instance={ ... })
Ayberk Özgür