250

We have this JSON Schema draft. I would like to take a sample of my JSON data and generate a skeleton for the JSON schema that I can rework manually, adding things like description, required, etc., which cannot be inferred from the specific examples.

For example, from my input example.json:

{
    "foo": "lorem", 
    "bar": "ipsum"
}

I would run my json_schema_generator tool and would get:

{ "foo": {
    "type" : "string",
    "required" : true,
    "description" : "unknown"
  },
  "bar": {
    "type" : "string",
    "required" : true,
    "description" : "unknown"
  }
}

This example has been coded manually, so it may have errors. Is there any tool out there which could help me with the conversion JSON -> JSON schema?
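
To make the request concrete, here is a minimal sketch of the kind of inference meant above, using only the Python standard library. The helper names `json_type` and `skeleton` are hypothetical, every key is assumed to be a fixed property, and the defaults ("required": true, "description": "unknown") simply mirror the hand-written output above rather than any particular schema draft.

```python
import json


def json_type(value):
    """Map a value produced by json.loads to a JSON Schema type name."""
    if value is None:
        return "null"
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def skeleton(value):
    """Build a schema skeleton for one sample value (illustrative only)."""
    node = {"type": json_type(value)}
    if isinstance(value, dict):
        # Assumption: every key seen in the sample is a fixed, required property.
        node["properties"] = {
            key: {**skeleton(child), "required": True, "description": "unknown"}
            for key, child in value.items()
        }
    elif isinstance(value, list) and value:
        # Assumption: the first element is representative of the whole array.
        node["items"] = skeleton(value[0])
    return node


sample = json.loads('{"foo": "lorem", "bar": "ipsum"}')
# Print only the per-property part, matching the layout used in the question.
print(json.dumps(skeleton(sample)["properties"], indent=2))
```

The dummy values would still have to be edited by hand; the point is only to avoid typing the structure.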

Simeon Leyzerzon
blueFast
  • But how would the tool know that it is not a generic map from strings to strings? – hmakholm left over Monica Sep 07 '11 at 23:21
  • In the example provided, I would say it is clear that we have a dictionary (python terminology), with key-value pairs, where the values happen to be strings. I do not know of any other JSON schema that would describe the same data. And this is just an easy example: it could get much more complicated, of course, as specified in the JSON schema draft. – blueFast Sep 08 '11 at 00:18
  • So you're claiming that "map from arbitrary strings to other arbitrary strings" (such as a mapping from file names to descriptions of the content) cannot be expressed as a JSON schema? For all I know, that may be true, but it would make that kind of schemata rather useless in my view. – hmakholm left over Monica Sep 08 '11 at 00:26
  • Mmmm, I am not sure we are discussing something relevant to the question, but anyway. Let's use a better example: having fixed keys in the JSON data is definitely useful if that JSON data is, for example, describing properties of a person. Instead of "foo" and "bar", think about "name", and "surname". "name" and "surname" are clearly fixed properties of the person JSON data, so they are not arbitrary strings: they are part of the person schema. The values are of course arbitrary, so they are not part of the schema. – blueFast Sep 08 '11 at 05:10
  • Having fixed keys is sometimes what you want, and sometimes it isn't. That's the entire point in fact: there's no way an automated tool can detect from a single sample which of the options you want. – hmakholm left over Monica Sep 08 '11 at 11:54
  • I see what you mean. Let's say all key names are considered fixed by default: a tool could work that way. Then it would produce the skeleton of the JSON schema, using the data types inferred from the JSON data. Most of the information would of course be arbitrary (the tool cannot know about most things: is it required, what is the description?), but I would still find value in having the skeleton produced for me, filled with dummy values, even if I have to edit it heavily. – blueFast Sep 08 '11 at 12:14
  • @HenningMakholm, a set of arbitrary pairs of strings (such as filename: description) would more logically be expressed as a list than a mapping: `{"type":"array","items":{["string","string"]}}`. I would say that fixed keys are nearly always what is intended with objects - the very word "properties" carries with it the implication that a property has a given name and a value with property-specific syntax. – Dave Mar 26 '16 at 02:41
  • @Dave: That doesn't seem to be the case in the _schema_ language being employed here, though. – hmakholm left over Monica Mar 26 '16 at 02:50
  • If you know for a fact that an object is being used to carry arbitrary pairs, then you can tell the schema generator to use `patternProperties` instead of `properties`. I have run across json data like that; the designer used numbers for property names that were arbitrary. And I have an extension to `GenSON` that generates `patternProperties` if you give it a regex matching the properties to be treated as arbitrary. – Dave Mar 26 '16 at 02:53
  • You probably want this: http://www.jsonschema2pojo.org/ It's the best for creating POJOs from API documentation! – Someone Somewhere Mar 30 '20 at 04:27
  • I wouldn't have voted the question off-topic. If you're a programmer, it's a great question. – Someone Somewhere Mar 30 '20 at 04:28
  • Try this tool, I've been using it for a few months: https://debug.center/json-schema-generator – Sailesh Kotha Aug 09 '20 at 04:06

12 Answers

160

Summarising the other answers, here are the JSON schema generators proposed so far:

Online:

Python:

NodeJS:

Ruby:

nirsky
Steve Bennett
  • jskemetor - no `setup.py` – Att Righ Aug 14 '17 at 12:59
  • Any chance you know if any of these support YAML inputs? We could convert, but just an extra step. – DylanYoung Oct 08 '19 at 14:46
  • Python: only genson is maintained. easy-json-schema works the same as genson, and it doesn't have a character limit like other online tools – MoonRaiser Dec 20 '22 at 09:34
  • https://www.liquid-technologies.com/online-json-to-schema-converter -> Created an exact schema for my input JSON; all existing fields are required, and the schema is about 3000 lines. https://easy-json-schema.github.io/ -> Simplified a repeating pattern (array); the schema is about 70 lines. No fields are required. (But that can be added by adding a * ..) – Benjamin Feb 07 '23 at 14:13
98

You might be looking for this:

http://www.jsonschema.net

It is an online tool that can automatically generate a JSON schema from a JSON string, and you can edit the schema easily.

Green Su
  • An easy and handy place to start. But note the issues with jsonschema.net reported elsewhere on this page, and the reasons discussed for wanting an offline, or at least API-accessible, tool that can be included in development workflows, allow updating schemas with later examples, etc. See also the nice list of options by Steve Bennett. – nealmcb Oct 26 '17 at 19:26
  • Please note that this site will throw unexpected errors when editing the schema after the initial import. – Coreus Nov 13 '17 at 11:22
  • Crashes for something like `{"hello": "world","num": 42}` but looks promising. – DBX12 Feb 02 '18 at 11:24
  • The old sites were definitely not good enough. [JSONSchema.Net](https://JSONSchema.Net) has now been rewritten. It's much more robust. If you have any issues, please report them on GitHub and I'll gladly fix them: https://github.com/jackwootton/json-schema – Jack Feb 21 '18 at 08:17
  • http://www.jsonschema2pojo.org/ is what I've been using for years – Someone Somewhere Mar 30 '20 at 04:29
  • Warning - This site now has a login wall unfortunately :( – user2085368 Nov 27 '21 at 21:11
49

GenSON (PyPI | GitHub) is a JSON Schema generator that can generate a single schema from multiple objects. You can also merge schemas with it. It is written in Python and comes with a CLI tool.

(Full disclosure: I'm the author.)
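
To illustrate what "a single schema from multiple objects" can mean, here is a rough standard-library sketch. It is not GenSON's implementation or API; `type_name` and `merge_samples` are hypothetical helpers, and the rule that a property is required only when it appears in every sample is an assumption made for this example.

```python
def type_name(value):
    """Map a decoded JSON value to a JSON Schema type name."""
    if value is None:
        return "null"
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array" if isinstance(value, list) else "object"


def merge_samples(samples):
    """Build one flat object schema from several sample dicts (illustrative)."""
    types = {}   # property name -> set of type names observed for it
    counts = {}  # property name -> number of samples that contain it
    for sample in samples:
        for key, value in sample.items():
            types.setdefault(key, set()).add(type_name(value))
            counts[key] = counts.get(key, 0) + 1

    properties = {}
    for key, seen in types.items():
        names = sorted(seen)
        # A single observed type stays a string; several become a list.
        properties[key] = {"type": names[0] if len(names) == 1 else names}

    # Assumption: only properties present in every sample are required.
    required = sorted(key for key, n in counts.items() if n == len(samples))
    return {"type": "object", "properties": properties, "required": required}


schema = merge_samples([
    {"name": "Ada", "age": 36},
    {"name": "Bob", "age": None, "email": "bob@example.com"},
])
print(schema)
```

Feeding in more samples can only widen the observed types and shrink the required list, which is the main benefit of generating from several objects rather than one.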

wolverdude
  • Nice work, man! I regret not finding this before I started to work on skinfer: https://github.com/scrapinghub/skinfer – Elias Dorneles Sep 23 '15 at 12:28
  • Not Python, but here's another one: https://github.com/snowplow/schema-guru – chuwy Sep 25 '15 at 12:21
  • Great! I've been disappointed with the online schema generator http://www.jsonschema.net (it fails to create "required" properties for most objects, has no options to produce compact (one-line) properties or omit IDs, and most importantly, generates a schema that fails to validate the data used to create it for single-schema arrays). Looking forward to trying your tool. – Dave Feb 17 '16 at 19:25
  • @Dave - I'm facing similar problems with jsonschema.net too; did this Python tool help? – Cshah Feb 07 '17 at 13:44
  • @Cshah: I'm extremely impressed with GenSON and contributed a patch to it. I needed to generate more restrictive schemas than the author was comfortable with so I forked a version with options to generate pattern properties and additionalProperties / additionalItems so that unrecognized JSON data will be flagged as needing attention. – Dave Feb 14 '17 at 15:52
  • @Dave Why not raise adequate PRs into genson so we can all benefit? – Asclepius Aug 07 '17 at 19:04
  • patternProperties are now supported – wolverdude Jan 06 '18 at 20:07
  • As mentioned elsewhere, the old sites were definitely not good enough. [JSONSchema.Net](https://JSONSchema.Net) has now been rewritten. It's much more robust. If you have any issues, please report them on GitHub and I'll gladly fix them: https://github.com/jackwootton/json-schema – Jack Feb 21 '18 at 08:18
  • Well that was a shockingly smooth experience. Installed easily and did exactly what I needed it to do. Great solution! – James Madison Sep 15 '20 at 18:08
22

Seeing that this question is getting quite a few upvotes, I am adding new information (I am not sure if this is new, but I couldn't find it at the time).

JaredMcAteer
blueFast
6

After several months, the best answer I have is my simple tool. It is raw but functional.

What I want is something similar to this. The JSON data can provide a skeleton for the JSON schema. I have not implemented it yet, but it should be possible to give an existing JSON schema as basis, so that the existing JSON schema plus JSON data can generate an updated JSON schema. If no such schema is given as input, completely default values are taken.

This would be very useful in iterative development: the first time the tool is run, the JSON schema is a dummy, but it can be refined automatically as the data evolves.
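
The "existing schema plus new data" idea could look roughly like the sketch below. It is only an illustration under assumptions: `refine` and `type_name` are hypothetical names, the schema uses the flat per-property layout from the question, and newly discovered properties simply get the same dummy defaults.

```python
DEFAULTS = {"required": True, "description": "unknown"}  # assumed dummy values


def type_name(value):
    """Map a decoded JSON value to a JSON Schema type name."""
    if value is None:
        return "null"
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array" if isinstance(value, list) else "object"


def refine(schema, data):
    """Return a copy of `schema` extended with properties found in `data`.

    Entries already in the schema (including hand-written descriptions) are
    kept untouched; only properties that are missing get skeleton defaults.
    """
    updated = {key: dict(spec) for key, spec in schema.items()}
    for key, value in data.items():
        if key not in updated:
            updated[key] = {"type": type_name(value), **DEFAULTS}
    return updated


# First run: no schema yet, so everything is filled with defaults.
first = refine({}, {"foo": "lorem"})
# Manual edit between runs.
first["foo"]["description"] = "Primary label"
# Later run: the data has grown; the manual edit survives.
second = refine(first, {"foo": "lorem", "baz": 3})
print(second)
```

Because existing entries are never overwritten, the tool can be rerun safely after each manual edit.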

blueFast
  • Curious as to how @Green Su's suggestion didn't live up to your needs. I think you are describing a utility that provides a jumpstarter (your term is 'skeleton'), something like a scaffolding code generator? – justSteve Aug 09 '12 at 02:20
  • Basically, the problem with that tool is that it is an *online* tool. I need to run it locally in my development environment, sometimes automatically as part of other tasks. A "copy here, paste there" tool does not help me. If it had a REST API that would be good enough. – blueFast Aug 09 '12 at 09:18
  • @justSteve: the online tool, in addition to using a copy-paste workflow, still appears buggy (4 years after the original question). I have json objects for which the tool produces incorrect schemas but have not yet reduced them to minimal test cases to submit as bug reports. – Dave Feb 17 '16 at 19:55
5

There's a Python tool that generates a JSON Schema for a given JSON document: https://github.com/perenecabuto/json_schema_generator

  • This is unmaintained since 2013. It doesn't support Python 3. Moreover, it only supports an older draft, i.e. `draft-03`. – Asclepius Aug 07 '17 at 16:35
5

generate-schema (NPM | GitHub) takes a JSON object and generates schemas from it; one of the output formats is JSON Schema. It's written in Node.js and comes with a REPL and a CLI tool that you can pipe files into.

Full Disclosure: I'm the author :)

Nijikokun
  • Any plans to update the module to draft 4+? Adding min, max attrs, references and so on? Thanks for the tool btw :) Will be using it in my Project – Mr. Alien Dec 23 '18 at 15:13
5

There's a Node.js tool which supports JSON Schema v4 at https://github.com/krg7880/json-schema-generator

It works either as a command-line tool or as a Node.js library:

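var jsonSchemaGenerator = require('json-schema-generator'),
    obj = { some: { object: true } },  // sample object to infer the schema from
    schemaObj;

// Generate the schema from the sample object defined above.
schemaObj = jsonSchemaGenerator(obj);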
3

json-schema-generator is a neat Ruby-based JSON schema generator. It supports both draft 3 and draft 4 of the JSON schema. It can be run as a standalone executable, or it can be embedded inside a Ruby script.

Then you can use json-schema to validate JSON samples against your newly generated schema if you want.

HappyCoder86
3

For offline tools that support multiple inputs, the best I've seen so far is https://github.com/wolverdude/GenSON/. I'd like to see a tool that takes filenames on standard input, because I have thousands of files. With that many files I run out of open file descriptors, so such a tool needs to make sure files are closed as it goes. I'd also like to see JSON Schema generators that handle recursion.

I am now working on generating Java classes from JSON objects, in the hope of going from my Java classes to JSON Schema. Here is my GenSON script if you are curious or want to identify bugs in it.

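#!/bin/sh
# Raise the open-file limit; thousands of input files exhaust the default.
ulimit -n 4096
# Remove earlier output and start from an empty x3d.json.
rm x3d*json
cat /dev/null > x3d.json
# Pass the example file names through goodJSON.js, then hand its output to GenSON as arguments.
find ~/Downloads/www.web3d.org/x3d/content/examples -name '*json' -print | xargs node goodJSON.js | xargs python bin/genson.py -i 2 -s x3d.json >> x3d.json
# Split the accumulated output at each line starting with '{' (-p is a BSD split option).
split -p '^{' x3d.json x3d.json
# Merge the split schema pieces together with one more sample file.
python bin/genson.py -i 2 -s x3d.jsonaa -s x3d.jsonab /Users/johncarlson/Downloads/www.web3d.org/x3d/content/examples/X3dForWebAuthors/Chapter02-GeometryPrimitives/Box.json > x3dmerge.json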
John Carlson
  • First, can you provide an answer to http://unix.stackexchange.com/questions/211803/which-version-of-split-supports-flag-p? – Dave Feb 17 '16 at 20:14
2

There are a lot of tools mentioned already, but for the record, here is one more, called JSON Schema inferencer:

https://github.com/rnd0101/json_schema_inferencer

(it's not a library or a product, but a Python script)

With the usual Full Disclosure: I am the author.

Roman Susi
1

For Node.js > 6.0.0 there is also the json-schema-by-example module.

Jerome WAGNER