I have two types of datasets: CSV and fixed-length. For CSV data the field list is just a list of names, while for fixed-length data each field is specified by fieldName and fieldLength. I need a JSON schema that validates both cases, but after trying several solutions, including the one below, I am not sure it can be done. Or maybe my understanding of JSON Schema is still far from perfect.

JSON:

{
      "dataset": "csv data",
      "dataFormat": "csv",
      "fieldList": [{
                  "fieldName": "id"
            },
            {
                  "fieldName": "num"
            },
            {
                  "fieldName": "struct"
            }
      ]
}

{
    "dataset": "fixed length",
    "dataFormat": "fixed",
    "fieldList": [{
            "fieldName": "id",
            "fieldLength": 13
        },
        {
            "fieldName": "num"
        },
        {
            "fieldName": "struct"
        }
    ]
}

JSON schema:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": [
    "dataset",
    "dataFormat",
    "fieldList"
  ],
  "properties": {
    "dataset": {
      "$id": "#/properties/dataset",
      "type": "string"
    },
    "dataFormat": {
      "$id": "#/properties/dataFormat",
      "type": "string",
      "enum": [
        "csv",
        "fixed"
      ]
    },
    "fieldList": {
      "$id": "#/properties/fieldList",
      "type": "array",
      "additionalItems": true,
      "items": {
        "$id": "#/properties/fieldList/items",
        "type": "object",
        "additionalProperties": true,
        "required": [
          "fieldName"
        ],
        "if": {
          "properties": {
            "dataFormat": {
              "const": "fixed"
            }
          }
        },
        "then": {"items":{
          "required": [
            "fieldLength"
          ]}
        },
        "properties": {
          "fieldName": {
            "$id": "#/properties/fieldList/items/properties/fieldName",
            "type": "string"
          },
          "fieldLength": {
            "$id": "#/properties/fieldList/items/properties/fieldLength",
            "type": "integer"
          }
        }
      }
    }
  }
}

Both documents validate successfully, even though in the "fixed" document only the first item includes the required fieldLength. Any suggestions?

mre

1 Answer

There are a few things in your schema that can be improved:

  1. The if/then is in the wrong place. Right now, the if looks for a "dataFormat" property inside each "fieldList" item and never finds one; because properties only constrains properties that are actually present, the if passes vacuously. The then in turn tries to enforce a "fieldLength" property in "fieldList".items.items, and since each "fieldList" item is an object and not an array, that inner items keyword is simply ignored.
  2. You should remove the additionalItems attribute. To quote json-schema.org:

    When items is a single schema, the additionalItems keyword is meaningless, and it should not be used.

  3. You can remove the additionalProperties attribute, since its default value is already true. Another quote from json-schema.org:

    The additionalProperties keyword is used to control the handling of extra stuff, that is, properties whose names are not listed in the properties keyword. By default any additional properties are allowed.

  4. The $id attributes are not adding much value (and are not compatible with the newer Draft 2019-09, where $id may no longer contain a non-empty fragment; plain-name fragments moved to the new $anchor keyword). You may want to omit those.
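
Point 1 can be checked directly. The following is a minimal sketch, assuming Python with the third-party `jsonschema` package installed; the variable names are illustrative only. It applies the misplaced if/then from the question to a single item that has no fieldLength:

```python
from jsonschema import Draft7Validator  # third-party: pip install jsonschema

# Item subschema from the question, reduced to the misplaced if/then.
item_schema = {
    "type": "object",
    "required": ["fieldName"],
    "if": {"properties": {"dataFormat": {"const": "fixed"}}},
    "then": {"items": {"required": ["fieldLength"]}},
}

# A field-list item without fieldLength, as in the "fixed" document.
item = {"fieldName": "num"}

# "properties" only constrains properties that are present, so the "if"
# holds for an item that has no "dataFormat" at all.
if_holds = Draft7Validator(item_schema["if"]).is_valid(item)

# "items" only applies to arrays, so the "then" accepts any object.
then_holds = Draft7Validator(item_schema["then"]).is_valid(item)

# Consequently the item as a whole is accepted.
item_accepted = Draft7Validator(item_schema).is_valid(item)

print(if_holds, then_holds, item_accepted)
```

All three checks pass, which is why the question's schema accepts the incomplete "fixed" document.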

Your main issue is point #1 here. You should be able to achieve what you want by adding something like this at the top level:

"oneOf": [
  {
    "properties": {
      "dataFormat": { "const": "csv" }
    }
  },
  {
    "properties": {
      "dataFormat": { "const": "fixed" },
      "fieldList": {
        "items": {
          "required": ["fieldLength"]
        }
      }
    }
  }
]

The complete schema could look like this:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": [
    "dataset",
    "dataFormat",
    "fieldList"
  ],
  "properties": {
    "dataset": {
      "$id": "#/properties/dataset",
      "type": "string"
    },
    "dataFormat": {
      "$id": "#/properties/dataFormat",
      "type": "string",
      "enum": [
        "csv",
        "fixed"
      ]
    },
    "fieldList": {
      "$id": "#/properties/fieldList",
      "type": "array",
      "items": {
        "$id": "#/properties/fieldList/items",
        "type": "object",
        "required": [
          "fieldName"
        ],
        "properties": {
          "fieldName": {
            "$id": "#/properties/fieldList/items/properties/fieldName",
            "type": "string"
          },
          "fieldLength": {
            "$id": "#/properties/fieldList/items/properties/fieldLength",
            "type": "integer"
          }
        }
      }
    }
  },
  "oneOf": [
    {
      "properties": {
        "dataFormat": {
          "const": "csv"
        }
      }
    },
    {
      "properties": {
        "dataFormat": {
          "const": "fixed"
        },
        "fieldList": {
          "items": {
            "required": [
              "fieldLength"
            ]
          }
        }
      }
    }
  ]
}
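
To try the complete schema against the documents from the question, something along these lines should work. It is only a sketch, assuming Python with the third-party `jsonschema` package; the schema is written in compact form without the $id keywords, and the third document (with lengths 5 and 20) is a made-up example that is not part of the question:

```python
from jsonschema import Draft7Validator  # third-party: pip install jsonschema

# Compact form of the complete schema ($id keywords omitted).
field = {
    "type": "object",
    "required": ["fieldName"],
    "properties": {
        "fieldName": {"type": "string"},
        "fieldLength": {"type": "integer"},
    },
}
schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "required": ["dataset", "dataFormat", "fieldList"],
    "properties": {
        "dataset": {"type": "string"},
        "dataFormat": {"type": "string", "enum": ["csv", "fixed"]},
        "fieldList": {"type": "array", "items": field},
    },
    "oneOf": [
        {"properties": {"dataFormat": {"const": "csv"}}},
        {
            "properties": {
                "dataFormat": {"const": "fixed"},
                "fieldList": {"items": {"required": ["fieldLength"]}},
            }
        },
    ],
}

documents = {
    # The two documents from the question.
    "csv": {
        "dataset": "csv data",
        "dataFormat": "csv",
        "fieldList": [{"fieldName": n} for n in ("id", "num", "struct")],
    },
    "fixed_incomplete": {
        "dataset": "fixed length",
        "dataFormat": "fixed",
        "fieldList": [
            {"fieldName": "id", "fieldLength": 13},
            {"fieldName": "num"},
            {"fieldName": "struct"},
        ],
    },
    # Hypothetical document: the lengths 5 and 20 are invented for this test.
    "fixed_complete": {
        "dataset": "fixed length",
        "dataFormat": "fixed",
        "fieldList": [
            {"fieldName": "id", "fieldLength": 13},
            {"fieldName": "num", "fieldLength": 5},
            {"fieldName": "struct", "fieldLength": 20},
        ],
    },
}

validator = Draft7Validator(schema)
results = {name: validator.is_valid(doc) for name, doc in documents.items()}
print(results)
```

The CSV document matches only the first oneOf branch, the incomplete fixed-length document matches neither, and the complete one matches only the second, so exactly the incomplete document is rejected.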

For completeness' sake, you can achieve the same with if/then, again placed at the top level of the schema:

"if": {
  "properties": {
    "dataFormat": {
      "const": "fixed"
    }
  }
},
"then": {
  "properties": {
    "fieldList": {
      "items": {
        "required": [
          "fieldLength"
        ]
      }
    }
  }
}
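
The if/then variant can be exercised the same way. This is a rough sketch, assuming Python with the third-party `jsonschema` package; the schema is trimmed to the parts that matter here, and the variable names are illustrative:

```python
from jsonschema import Draft7Validator  # third-party: pip install jsonschema

# Trimmed schema with the if/then at the top level.
schema = {
    "type": "object",
    "required": ["dataFormat", "fieldList"],
    "properties": {
        "dataFormat": {"enum": ["csv", "fixed"]},
        "fieldList": {
            "type": "array",
            "items": {"type": "object", "required": ["fieldName"]},
        },
    },
    "if": {"properties": {"dataFormat": {"const": "fixed"}}},
    "then": {
        "properties": {
            "fieldList": {"items": {"required": ["fieldLength"]}}
        }
    },
}

csv_doc = {
    "dataFormat": "csv",
    "fieldList": [{"fieldName": "id"}, {"fieldName": "num"}],
}
fixed_doc = {
    "dataFormat": "fixed",
    "fieldList": [
        {"fieldName": "id", "fieldLength": 13},
        {"fieldName": "num"},  # missing fieldLength
    ],
}

validator = Draft7Validator(schema)
csv_errors = list(validator.iter_errors(csv_doc))
fixed_errors = list(validator.iter_errors(fixed_doc))

# Show where each failure occurred in the fixed-length document.
for error in fixed_errors:
    print(list(error.absolute_path), error.message)
```

Because "dataFormat" is listed under required at the top level, a document without it is rejected regardless of the if, so the vacuous match of properties does no harm here. Validators also tend to report a failed then more specifically than a failed oneOf, which is usually reported only as no branch matching.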
Carsten
  • Thank you, your solution works fine! Unfortunately my use case is a little more complex than the one in my question: the dataset is an array, each item with its own fieldList. I tried to adapt your "oneOf" solution by adding dataset/items before the fieldList, but with no success... – mre Apr 16 '20 at 13:11
  • @mre maybe add that via an edit to the bottom of your question and I can try to assist you with that as well – Carsten Apr 16 '20 at 13:41
  • @mre you'd have to add `dataset`/`items`/`properties` before `fieldList` I'd imagine. Don't forget that last `properties`. – Carsten Apr 16 '20 at 19:29
  • Thanks again @Carsten, that last `properties` did the trick, everything is working now! – mre Apr 17 '20 at 14:06