I have a huge collection of JSON files extracted from a public API. Using Node.js, I'm trying to derive the schema of the object the files describe (each key and the type of its value), in the same way schemas are described in Mongoose, e.g.:

schema = {
  key1: 'Number',
  key2: 'Boolean',
  key3: {
    key31: 'String',
    key32: 'Boolean'
  },
  key4: [{
    key41: 'String',
    key42: 'Number'
  }]
};
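
A small sketch of how a single parsed JSON value might be mapped to the capitalised type names used in the example above. `typeName` is a hypothetical helper, not part of Mongoose or lodash, and the 'Null' and 'Array' labels are placeholders chosen for illustration. Plain `typeof` yields lowercase names and reports 'object' for both `null` and arrays, which is why those two cases are checked first.

```javascript
// Hypothetical helper: map a parsed JSON value to a capitalised type name.
function typeName(value) {
  if (value === null) return 'Null';        // typeof null is 'object'
  if (Array.isArray(value)) return 'Array'; // typeof [] is 'object'
  var t = typeof value;                     // 'number', 'boolean', 'string' or 'object'
  return t.charAt(0).toUpperCase() + t.slice(1);
}

console.log(typeName(42));    // Number
console.log(typeName(true));  // Boolean
console.log(typeName('abc')); // String
```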

In order to do this, I wrote a script that takes the first file and gets the keys and the types of their values (recursively, as the object has sub-objects); it then compares the second file to the first and adds any keys the first file was missing, and so on. It may not be the best way to achieve this, but it (almost) does the job.

Unfortunately, after running the script, every type inside an array of sub-objects (key41 and key42 in the example above) is reported as string, although some of the values are booleans, numbers, and so on.

Here's what I have so far:

var glob = require('glob'),
    _ = require('lodash');

function getSchema(from, to) {
  for (var key in from) {
    if (from.hasOwnProperty(key)) {
      if (_.isObject(from[key])) {
        //_.isObject() returns true for arrays
        if (_.isArray(from[key])) {
          if (!to.hasOwnProperty(key)) {
            to[key] = [{}];
          }
          if (from[key].length) {
            var tmp = {};
            for (var i = 0; i < from[key].length; i++) {
              getSchema(from[key][i], tmp);
            }
            getSchema(tmp, to[key][0]);
          }
        } else {
          if (!to.hasOwnProperty(key)) {
            to[key] = {};
          }
          getSchema(from[key], to[key]);
        }
      } else {
        to[key] = typeof from[key];
      }
    }
  }
};

glob('./data/**/*.json', {}, function(err, files) {
  if (err) console.log(err);
  var schema = {};
  _.forEach(files, function(file) {
    var data = require(file);
    getSchema(data, schema);
  });
  //schema should be an object, mirroring the schema of the JSON files
});
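
A side note on loading the files: `require(file)` resolves a relative path against the directory of the calling module, whereas `glob` by default reports matches relative to the current working directory, so the two only agree when the script is run from its own directory. A hedged alternative is to read and parse each file explicitly. `loadJson` is a hypothetical helper name introduced for this sketch.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: read one JSON file, resolving a relative path
// against the current working directory (the same base glob uses by default).
function loadJson(file) {
  var absolute = path.resolve(file);
  return JSON.parse(fs.readFileSync(absolute, 'utf8'));
}
```

Inside the `_.forEach` callback, `var data = loadJson(file);` could then stand in for `var data = require(file);`.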

Can someone help me fix this?

Thanks in advance.

leroydev
aknorw
  • I have not read through your code deeply, but my guess is that it is reading the keys of the objects in the array as the values. So it reads 'key41', and sees that it is typeof 'string'. I would look there first. – dz210 Jan 26 '16 at 14:49
  • Maybe this question can be helpful? http://stackoverflow.com/questions/7341537/tool-to-generate-json-schema-from-json-data – Antonio Jan 26 '16 at 14:50
  • @dz210 I guess that's the issue here: if I replace `to[key] = typeof from[key]` with `to[key] = from[key]`, I get the right type of data for the first value found. @Antonio Thanks for the useful link. – aknorw Jan 26 '16 at 15:00
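
To illustrate the point discussed in these comments: in the question's script, array elements are first collected into `tmp`, so `tmp` holds type names such as 'number' rather than real values. The second pass, `getSchema(tmp, to[key][0])`, then evaluates `typeof 'number'`, which is always 'string'. Below is a minimal sketch of one possible fix, not a definitive implementation. It assumes every array element is a plain object, uses `Array.isArray` in place of lodash so the snippet is self-contained, and does not try to reconcile a key whose type differs between files. Each element is merged straight into `to[key][0]`.

```javascript
// Sketch: same traversal as the question's getSchema, without the tmp object.
function getSchema(from, to) {
  Object.keys(from).forEach(function (key) {
    var value = from[key];
    if (Array.isArray(value)) {
      if (!Object.prototype.hasOwnProperty.call(to, key)) {
        to[key] = [{}];
      }
      // Merge every element directly, so typeof only ever sees real values,
      // never previously computed type names.
      value.forEach(function (item) {
        getSchema(item, to[key][0]);
      });
    } else if (value !== null && typeof value === 'object') {
      if (!Object.prototype.hasOwnProperty.call(to, key)) {
        to[key] = {};
      }
      getSchema(value, to[key]);
    } else {
      to[key] = typeof value; // 'number', 'boolean', 'string', ...
    }
  });
  return to;
}

var schema = getSchema({
  key1: 1,
  key4: [{ key41: 'a', key42: 2 }, { key43: true }]
}, {});
console.log(JSON.stringify(schema));
// {"key1":"number","key4":[{"key41":"string","key42":"number","key43":"boolean"}]}
```

The type names stay lowercase because they come straight from `typeof`; mapping them to the capitalised form shown in the question would be a separate step.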

1 Answer

Your final statement,

to[key] = typeof from[key];

should be

to[key] = from[key]

unless I am reading this wrong and the example key:value pairs contain actual values for the 'value' and not the typeof result.

dz210
  • Your solution does not provide the type of the value for a key, but the first value found for that key. The example given shows what I want to achieve. – aknorw Jan 26 '16 at 15:01