
In my schema I declared these properties:

"index_name": {
      "type": "string",
      "examples": ["foo-wwen-live", "foo"]
    },
"locale": {
      "type": "string",
      "examples": ["wwen", "usen", "frfr"]
},
"environment": {
      "type": "string",
      "default": "live",
      "examples": [
        "staging",
        "edgengram",
        "test"
      ]
}

I want a JSON body validated against my schema to be valid only if:

  • index_name is present, and both locale and environment are not present;
  • locale and/or environment are present, and index_name is not present.

In short, locale and environment should never be mixed with index_name.

Test cases and desired results:

These should pass:
Case #1

{
  "locale": "usen"
}

Case #2

{
  "environment": "foo"
}

Case #3

{
  "environment": "foo",
  "locale": "usen"
}

Case #4

{
  "index_name": "foo-usen"
}

These should NOT pass:
Case #5

{
  "index_name": "foo-usen",
  "locale": "usen"
}

Case #6

{
  "index_name": "foo-usen",
  "environment": "foo"
}

Case #7

{
  "index_name": "foo-usen",
  "locale": "usen",
  "environment": "foo"
}

I created the following rule for my schema, but it does not cover all the cases. If both locale and environment are present, validation fails when index_name is also present, which is the correct behavior for case #7. But if only one of locale and environment is present, index_name is still allowed alongside it, so cases #5 and #6 wrongly pass.

  "oneOf": [
    {
      "required": ["index_name"],
      "not": {"required":  ["locale", "environment"]}
    },
    {
      "anyOf": [
        {
          "required": ["locale"],
          "not": {"required": ["index_name"]}
        },
        {
          "required": ["environment"],
          "not": {"required": ["index_name"]}
        }
      ]
    }
  ]

I'm getting mixed information on how the "not": {"required": []} declaration works. Some people claim that it forbids every property listed in the array, contrary to what the syntax suggests. Others claim that it should be taken exactly as it sounds: the properties listed in the array are not required, so they can be present, but it doesn't matter if they aren't.

Apart from this rule, I also require one non-related property to be present in all cases and I set "additionalProperties": false.

What is the rule that would satisfy all my test cases?

Maks Babarowski
  • `not` inverts the result of applying the subschema to the instance. `not` itself doesn't care what the subschema value looks like, just what the assertion result is. – Relequestual Apr 06 '20 at 13:33
  • You listed your requirements near the top, but it also looks like you have an additional requirement at the end. Is this correct? If so, could you move the requirement at the end so that all your requirements are in one place, please? I'll then be able to provide you a solution. – Relequestual Apr 06 '20 at 13:34
  • I don't have any other requirements except for those mentioned in the question. All of them look like this: https://pastebin.com/Qk070u4e and are on the same level as "properties" node. – Maks Babarowski Apr 06 '20 at 14:08
  • I'm confused by "Apart from this rule, I also require one non-related property to be present in all cases and I set "additionalProperties": false." because `additionalProperties: false` prevents additional properties. – Relequestual Apr 06 '20 at 14:09
  • OK. I see. I understand. I'll be able to provide you an answer in about an hour's time =] – Relequestual Apr 06 '20 at 14:11
  • Fine everyone beat me to it... =/ – Relequestual Apr 06 '20 at 15:11

2 Answers


Dependencies

This is a job for the dependencies keyword. The following says:

  • if "locale" is present, then "index_name" is forbidden.
  • if "environment" is present, then "index_name" is forbidden.


"dependencies": {
  "locale": { "not": { "required": ["index_name"] } },
  "environment": { "not": { "required": ["index_name"] } }
}
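
To see how this rule behaves on the seven test cases from the question, here is a minimal sketch in Python. It assumes a hand-rolled evaluator (the hypothetical `is_valid` helper) that understands only "required", "not" and schema-valued "dependencies"; it is not a full JSON Schema validator, and a real validator library would normally do this job.

```python
def is_valid(schema, instance):
    """Evaluate a tiny, illustrative subset of JSON Schema against a dict."""
    for keyword, value in schema.items():
        if keyword == "required":
            ok = all(name in instance for name in value)
        elif keyword == "not":
            ok = not is_valid(value, instance)
        elif keyword == "dependencies":
            # Only schema-valued dependencies are handled in this sketch.
            ok = all(is_valid(sub, instance)
                     for prop, sub in value.items() if prop in instance)
        else:
            raise ValueError(f"unsupported keyword: {keyword}")
        if not ok:
            return False
    return True


SCHEMA = {
    "dependencies": {
        "locale": {"not": {"required": ["index_name"]}},
        "environment": {"not": {"required": ["index_name"]}},
    }
}

# Cases #1 to #7 from the question, in order.
CASES = [
    {"locale": "usen"},
    {"environment": "foo"},
    {"environment": "foo", "locale": "usen"},
    {"index_name": "foo-usen"},
    {"index_name": "foo-usen", "locale": "usen"},
    {"index_name": "foo-usen", "environment": "foo"},
    {"index_name": "foo-usen", "locale": "usen", "environment": "foo"},
]

results = [is_valid(SCHEMA, case) for case in CASES]
for number, (case, result) in enumerate(zip(CASES, results), start=1):
    print(f"Case #{number}: {'valid' if result else 'invalid'}")
```

Under these assumptions, cases #1 to #4 come out valid and cases #5 to #7 invalid. Note that this rule on its own does not require any of the three properties to be present.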

What's up with not-required?

There's a sub-question about how not-required works. It's confusing because it doesn't mean what it reads like in English, but it's close enough that we sometimes think it does.

In the above example, if we read it as "not required", it sounds like it means "optional". A more accurate description would be "forbidden": "required" passes when the property is present, and "not" inverts that result, so the whole thing passes only when the property is absent.

That's awkward, but not too bad. Where it gets confusing is when you want to "forbid" more than one property. Let's assume we want to say, if "foo" is present, then "bar" and "baz" are forbidden. The first thing you might try is this.

"dependencies": {
  "foo": { "not": { "required": ["bar", "baz"] } }
}

However, what this says is that if "foo" is present, then the instance is invalid if both "bar" AND "baz" are present. They both have to be there to trigger failure. What we really wanted is for it to be invalid if "bar" OR "baz" is present.

"dependencies": {
  "foo": {
    "not": {
      "anyOf": [
        { "required": ["bar"] },
        { "required": ["baz"] }
      ]
    }
  }
}
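
The difference between the two forms can be checked with a small Python sketch. It assumes the same kind of hand-rolled evaluator (the hypothetical `is_valid` helper, covering only "required", "not", "anyOf" and schema-valued "dependencies"), so treat it as an illustration rather than a real validator.

```python
def is_valid(schema, instance):
    """Evaluate a tiny, illustrative subset of JSON Schema against a dict."""
    for keyword, value in schema.items():
        if keyword == "required":
            ok = all(name in instance for name in value)
        elif keyword == "not":
            ok = not is_valid(value, instance)
        elif keyword == "anyOf":
            ok = any(is_valid(sub, instance) for sub in value)
        elif keyword == "dependencies":
            # Only schema-valued dependencies are handled in this sketch.
            ok = all(is_valid(sub, instance)
                     for prop, sub in value.items() if prop in instance)
        else:
            raise ValueError(f"unsupported keyword: {keyword}")
        if not ok:
            return False
    return True


# First attempt: fails only when "bar" AND "baz" are both present.
FORBID_BOTH = {
    "dependencies": {"foo": {"not": {"required": ["bar", "baz"]}}}
}

# Second attempt: fails when "bar" OR "baz" is present.
FORBID_EITHER = {
    "dependencies": {
        "foo": {
            "not": {"anyOf": [{"required": ["bar"]}, {"required": ["baz"]}]}
        }
    }
}

foo_and_bar = {"foo": 1, "bar": 2}
foo_bar_baz = {"foo": 1, "bar": 2, "baz": 3}
no_foo = {"bar": 2, "baz": 3}

print(is_valid(FORBID_BOTH, foo_and_bar))    # "bar" alone slips through
print(is_valid(FORBID_EITHER, foo_and_bar))  # "bar" alone is rejected
print(is_valid(FORBID_BOTH, foo_bar_baz))
print(is_valid(FORBID_EITHER, foo_bar_baz))
print(is_valid(FORBID_EITHER, no_foo))       # no "foo", so nothing is forbidden
```

Only the instance with "foo" and a single extra property separates the two schemas: the first form accepts it, the second rejects it.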

Why is this so hard?

JSON Schema is optimized for schemas that are tolerant to changes. The schema should enforce that the instance has the necessary data to accomplish a certain task. If it has more than it needs, the application ignores the rest. That way, if something is added to the instance, everything still works. It shouldn't fail validation if the instance has a few extra fields that the application doesn't use.

So, when you try to do something like forbidding things that you could otherwise ignore, you're going a bit against the grain of JSON Schema and things can get a little ugly. However, sometimes it's necessary. I don't know enough about your situation to make that call, but I'd guess that dependencies is probably necessary in this case, but additionalProperties is not.

Jason Desrosiers
  • At first glance I thought that your answer was missing this point: 'if "index_name" is present, neither "locale" nor "environment" is allowed'. However, after trying out the rule you provided I came to the conclusion that this rule is a logical consequence of the two rules you applied. Thank you for your in-detail explanation on how to use `not`-`required`. I've seen it being used multiple times, but never against multiple fields. Also, thank you for your insights on my use case, I'll revise my requirements. – Maks Babarowski Apr 06 '20 at 16:14
  • On your "why is this so hard" explanation, I wanted to first thank you for including it and then say that having come from an XML world I find the unrestrictive default for JSON schema quite vexing. We have complex JSON configuration schema and want any derived JSON document to conform in a restrictive manner to it, so the reader doesn't get confused by unexpected "trash" content. It is quite bizarre to be unrestrictive in this context but there's no easy way to simply turn on restrictivity (which would be a great feature IMHO). – Phil W Aug 25 '21 at 18:39
  • @PhilW I would argue that extra properties should be considered a warning, not a validation error. Ideally there would be a linting process telling you that your configuration has extra properties. Your configuration is still valid. It will work. That's the job of the validator to determine. Whether or not your configuration is clean and free of "trash" is the job of a linter. Unfortunately, there hasn't been much work in linting JSON based on a JSON Schema, but that's the missing piece IMO. – Jason Desrosiers Aug 25 '21 at 20:12
  • @jasondesrosiers I get where you are coming from. My counter argument is that the user may expect different behaviour because they didn't notice a typo in a property name. Indeed, we have had this so many times because JSON schema is suggestive rather than restrictive. A real failing by the spec IMHO. – Phil W Aug 25 '21 at 20:49
  • I don't see how that argument changes anything. I'm not denying that permissiveness is problematic in some domains (such as configuration validation). In those situations, users need feedback beyond the scope of validation. A linter would provide that needed feedback. The lack of linter tooling available has nothing to do with any failing of the spec. It's just a tooling gap. – Jason Desrosiers Aug 25 '21 at 21:19
  • The "domain" of interest to me is configuration. I just wish I could flip a switch (with a simple setting in the schema) to make it restrictive. The amount of bloat in my schema to try to make it restrictive is huge. – Phil W Aug 25 '21 at 21:27
  • I think you're missing my point. You shouldn't have to change anything in your schema to go from permissive to restrictive. You need tooling that gives you restrictive feedback. However, I know that the necessary tooling is not available to you and the only option you are left with is to write horribly complex and bloated schemas in order to simulate the functionality that the tooling is lacking. That sucks and I'm sorry you have to do that. I just don't think that's a spec problem. It's a tooling gap problem. – Jason Desrosiers Aug 25 '21 at 21:58
  • A schema should be complete and not require the addition of further tooling to cover validation aspects not specified in that schema. That's why I would want an explicit control in the schema to say "this is restrictive". To allow typos and structure mistakes to go unnoticed without having an additional tool that makes assumptions about your schema's meaning is, to me, wrong. – Phil W Aug 26 '21 at 05:48
  • can we decide presence of a property depending on the value of some other property? – Afzal Ali Mar 31 '23 at 11:28

since required: [a, b] means (a must be present AND b must be present)

then not: {required: [a, b]} means (NOT (a must be present AND b must be present))

which is logically equivalent to (a must NOT be present OR b must NOT be present).

so that is not the correct expression to say that (a must NOT be present AND b must NOT be present). you need two nots.

here is the correct expression, given your requirements:

{
  "oneOf": [
    {
      "required": ["index_name"],
      "allOf": [
        {"not": {"required": ["locale"]}},
        {"not": {"required": ["environment"]}}
      ]
    },
    {
      "anyOf": [
        {"required": ["locale"]},
        {"required": ["environment"]}
      ],
      "not": {
        "required": ["index_name"]
      }
    }
  ]
}
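
As a rough check of the schema above against the seven cases from the question, here is a Python sketch. It assumes a hand-rolled evaluator (the hypothetical `is_valid` helper) that supports only "required", "not", "anyOf", "allOf" and "oneOf"; it is an illustration, not a complete JSON Schema implementation.

```python
def is_valid(schema, instance):
    """Evaluate a tiny, illustrative subset of JSON Schema against a dict."""
    for keyword, value in schema.items():
        if keyword == "required":
            ok = all(name in instance for name in value)
        elif keyword == "not":
            ok = not is_valid(value, instance)
        elif keyword == "anyOf":
            ok = any(is_valid(sub, instance) for sub in value)
        elif keyword == "allOf":
            ok = all(is_valid(sub, instance) for sub in value)
        elif keyword == "oneOf":
            # Exactly one subschema must match.
            ok = sum(is_valid(sub, instance) for sub in value) == 1
        else:
            raise ValueError(f"unsupported keyword: {keyword}")
        if not ok:
            return False
    return True


SCHEMA = {
    "oneOf": [
        {
            "required": ["index_name"],
            "allOf": [
                {"not": {"required": ["locale"]}},
                {"not": {"required": ["environment"]}},
            ],
        },
        {
            "anyOf": [{"required": ["locale"]}, {"required": ["environment"]}],
            "not": {"required": ["index_name"]},
        },
    ]
}

# Cases #1 to #7 from the question, in order.
CASES = [
    {"locale": "usen"},
    {"environment": "foo"},
    {"environment": "foo", "locale": "usen"},
    {"index_name": "foo-usen"},
    {"index_name": "foo-usen", "locale": "usen"},
    {"index_name": "foo-usen", "environment": "foo"},
    {"index_name": "foo-usen", "locale": "usen", "environment": "foo"},
]

results = [is_valid(SCHEMA, case) for case in CASES]
empty_result = is_valid(SCHEMA, {})
for number, result in enumerate(results, start=1):
    print(f"Case #{number}: {'valid' if result else 'invalid'}")
print(f"Empty object: {'valid' if empty_result else 'invalid'}")
```

Under these assumptions, cases #1 to #4 are valid and cases #5 to #7 are invalid. An empty object is also rejected, because neither oneOf branch matches when none of the three properties is present.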
Ethan
  • there are various ways to refactor this with logical equivalence, e.g. the first oneOf's locale/environment could be equivalently restricted by {not: {anyOf: [{required: [locale]}, {required: [environment]}]}} – Ethan Apr 06 '20 at 14:51