I'm using the schema library.

How can I create a schema that validates that a dictionary contains at least one of the keys below, with the corresponding value types?

from schema import Schema, Optional

mydict_schema = Schema({
    Optional('name'): str,
    Optional('name_id'): int,
})

At the moment the keys are all Optional, but I want there to be at least one of them.

dreftymac

3 Answers

0

Context

  • python2
  • validation with schema library

Use-case

  • DevSyedK wants a schema validation constraint that requires a dictionary to have at least one key from a set of possible keys
  • The current schema acts as a ZeroOrMore constraint (every key is optional), but DevSyedK wants a OneOrMore constraint

Solution

  • Establish two lists, one list with all possible keys, and the other list with the actual keys contained in the data to be validated
  • Create a schema constraint that returns True if and only if the intersection of the two lists is non-empty

Demo code

  • Note: this is not a complete solution to the question, just a proof-of-concept.

      from schema import Schema

      lstkeys_possible  = ['alpha','bravo','charlie']
      ## three alternative inputs; only the last assignment is used
      lstkeys_actual    = []         ## won't validate
      lstkeys_actual    = ['zulu']   ## won't validate
      lstkeys_actual    = ['alpha']  ## will validate
      Schema( lambda vinput: bool(set(vinput[0]) & set(vinput[1])) ).validate( [lstkeys_possible,lstkeys_actual] )
      

dreftymac
0

I ran into this problem too, so I'm posting a solution for future reference. One way to accomplish this is to create a subclass with some custom logic.

from schema import Schema, SchemaError, Optional

class AtLeastOneSchema(Schema):
    one_required = {'name', 'name_id'}

    def validate(self, data, **kwargs):
        val_schema = Schema(
            self.schema,
            self._error,
            self._ignore_extra_keys,
            self._name,
            self._description,
            self.as_reference,
        )
        # Validate the data as usual with a plain Schema. Using a
        # separate instance avoids recursive calls to this
        # overridden validate method.
        rv = val_schema.validate(data, **kwargs)

        # Now to the custom logic to ensure one of the keys exists
        e = self._error
        found_one = False
        for key in data:
            if key in self.one_required:
                found_one = True
        if not found_one:
            message = (
                f"Missing key from {self.one_required}"
            )
            raise SchemaError(message, e.format(data) if e else None)
        return rv

Now you can create your schema with the new subclass:

mydict_schema = AtLeastOneSchema({
    Optional('name'): str,
    Optional('name_id'): int,
    Optional(str): str  # so that you can add any other arbitrary data
})

Example usage:

>>> good_data = {'name': "john"}
>>> mydict_schema.validate(good_data)
{'name': 'john'}

>>> bad_data = {"foo": "bar"}
>>> mydict_schema.validate(bad_data)
Traceback (most recent call last):
schema.SchemaError: Missing key from {'name', 'name_id'}
Hass
0

Use lets you call a function on the data currently being validated, and And lets you validate against two or more constructs in turn. Combining them, you can define a function that checks the data "in-flight" and raises an exception if you're not happy with what you see:

from schema import Schema, Optional, Use, And, SchemaMissingKeyError

def validate_mydict_schema(data):
    if not len(data):
        raise SchemaMissingKeyError("specify at least one of 'name' and 'name_id'")
    return data

mydict_schema = Schema(And({
    Optional('name'): str,
    Optional('name_id'): int
}, Use(validate_mydict_schema)))

>>> mydict_schema.validate({})
SchemaError: specify at least one of 'name' and 'name_id'

>>> mydict_schema.validate({"name": "John"})
{'name': 'John'}

One really nice thing is that you can even modify the data. For example, you could look up 'name' when only 'name_id' is given and return it, so that the validation result is always guaranteed to include the 'name' key.

def validate_mydict_schema(data):
    if not len(data):
        raise SchemaMissingKeyError("one of keys 'name' and 'name_id' is required")
    if "name" not in data:
        data["name"] = "insert lookup using data['name_id'] here..."
    return data

>>> mydict_schema.validate({"name_id": 1}).keys()
dict_keys(['name_id', 'name'])
Hans Bouwmeester