Problem Description

I am writing a simple JSON analyzer to perform syntax analysis of JSON strings. I receive JSON strings with arbitrary structure and want to export their syntactic structure. As a result, I want to get a tree that describes the format of the JSON string (keys, values, arrays, etc.) and the type of every element. I have already found the syntax definition of JSON (shown below):

object
    {}
    { members } 
members
    pair
    pair , members
pair
    string : value
array
    []
    [ elements ]
elements
    value
    value , elements
value
    string
    number
    object
    array
    true
    false
    null 

Example

JSON String:

{"widget": {
    "null": null,
    "window": {
         153: "This is string",
        "boolean": true,
        "int": 500,
        "float": 5.555
    }
}}    

And I want to get something like:

{ KEY_STR : {
     KEY_STR : null
     KEY_ARRAY : {
        KEY_INT: VALUE_STR,
        KEY_STR: VALUE_BOOL,
        KEY_STR: VALUE_INT,
        KEY_STR: VALUE_FLOAT
     }
}}

I am using Java with the Gson library.

How I want to use that

I want to export the abstract syntax tree so that I can create my own messages automatically.

My Question

I have started implementing this with JsonParser: I parse the JSON object and then determine the type of every key and value. But I am wondering whether I am on the right track or reinventing the wheel. Does something already exist that exports the abstract syntax tree, or should I implement it myself?

Ira Baxter
Manos Kirtas
  • What's the point? "Create my own messages automatically" does not seem very viable based on the example output, as "KEY_STR" feels a bit broader than "widget", at least to me. – tevemadar Jul 18 '18 at 17:32
  • Btw: how do you plan representing the tree? That example with KEY_STR, etc. is not exactly Java. – tevemadar Jul 19 '18 at 16:12
  • Hello @tevemadar, maybe I didn't describe my problem very well because I am a bit confused. Let me explain the whole picture. I am receiving a bunch of JSON strings with arbitrary structure. My first goal is to represent the JSON structure in a generic format, because I want to iterate over the JSON values and substitute my own input values. I thought a syntax tree would be a powerful representation for iterating over JSON. – Manos Kirtas Jul 19 '18 at 18:49
  • Also, I want to change the names/keys of the JSON and replace them with my own set of keys. In short, I just want to keep the structure of all those JSON strings and iterate over them with another set of keys and values. Maybe a better approach is to create a new JSON string from the existing one each time and replace whatever I want. For that reason I am looking [here](https://stackoverflow.com/questions/20442265/how-to-decode-json-with-unknown-field-using-gson); it may help. – Manos Kirtas Jul 19 '18 at 18:49

1 Answer

I have no idea what JsonParser will export. But in general, parsing something, exporting the AST from an in-memory AST data structure, reading the exported AST back in, and then extracting values from it seems like a lot of overhead goo to build and maintain.

What you should do is build the JSON parser into your application, parse the JSON to an AST data structure, and simply process that AST structure directly. Frankly, JSON is simple enough that you could write your own recursive descent parser to parse JSON and build the AST, which leads you back to the first solution. See https://stackoverflow.com/a/2336769/120163
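To give a feel for how small such a parser can be, here is a minimal sketch of a recursive descent parser in plain Java that follows the grammar from the question and, instead of building a full AST, directly produces the kind of type skeleton the OP asked for. The class name `JsonShape`, the method `describe`, and the exact output format are my own assumptions for illustration; string escapes and number syntax are checked only loosely, so treat it as a starting point rather than a validating parser.

```java
// Hypothetical helper: one method per grammar rule (value, object, array, ...).
class JsonShape {
    private final String src;
    private int pos = 0;

    private JsonShape(String src) { this.src = src; }

    /** Returns a type skeleton such as {KEY_STR: VALUE_INT}. */
    static String describe(String json) {
        JsonShape p = new JsonShape(json);
        String shape = p.value();
        p.skipWs();
        if (p.pos != p.src.length()) throw p.error("trailing characters");
        return shape;
    }

    // value: string | number | object | array | true | false | null
    private String value() {
        skipWs();
        char c = peek();
        if (c == '{') return object();
        if (c == '[') return array();
        if (c == '"') { string(); return "VALUE_STR"; }
        if (c == '-' || (c >= '0' && c <= '9')) return number();
        if (literal("true") || literal("false")) return "VALUE_BOOL";
        if (literal("null")) return "null";
        throw error("unexpected input");
    }

    // object: {} | { members };  members: pair | pair , members;  pair: string : value
    private String object() {
        StringBuilder out = new StringBuilder("{");
        pos++; // consume '{'
        skipWs();
        if (consume('}')) return "{}";
        do {
            skipWs();
            string(); // keys are always strings in JSON
            skipWs();
            expect(':');
            if (out.length() > 1) out.append(", ");
            out.append("KEY_STR: ").append(value());
            skipWs();
        } while (consume(','));
        expect('}');
        return out.append('}').toString();
    }

    // array: [] | [ elements ];  elements: value | value , elements
    private String array() {
        StringBuilder out = new StringBuilder("[");
        pos++; // consume '['
        skipWs();
        if (consume(']')) return "[]";
        do {
            if (out.length() > 1) out.append(", ");
            out.append(value());
            skipWs();
        } while (consume(','));
        expect(']');
        return out.append(']').toString();
    }

    // Skips over a quoted string; escape sequences are skipped, not validated.
    private void string() {
        expect('"');
        while (peek() != '"') {
            if (pos >= src.length()) throw error("unterminated string");
            if (src.charAt(pos) == '\\') pos++; // skip the escaped character
            pos++;
        }
        pos++; // closing quote
    }

    // Numbers with a fraction or an exponent are reported as floats.
    private String number() {
        boolean isFloat = false;
        consume('-');
        if (digits() == 0) throw error("expected digit");
        if (consume('.')) {
            isFloat = true;
            if (digits() == 0) throw error("expected digit");
        }
        if (consume('e') || consume('E')) {
            isFloat = true;
            if (!consume('+')) consume('-');
            if (digits() == 0) throw error("expected digit");
        }
        return isFloat ? "VALUE_FLOAT" : "VALUE_INT";
    }

    private int digits() {
        int start = pos;
        while (peek() >= '0' && peek() <= '9') pos++;
        return pos - start;
    }

    private boolean literal(String word) {
        if (!src.startsWith(word, pos)) return false;
        pos += word.length();
        return true;
    }

    private boolean consume(char c) {
        if (peek() != c) return false;
        pos++;
        return true;
    }

    private void expect(char c) {
        if (!consume(c)) throw error("expected '" + c + "'");
    }

    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    private void skipWs() { while (Character.isWhitespace(peek())) pos++; }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at offset " + pos);
    }
}
```

Calling `JsonShape.describe(text)` on the OP's example returns a nested `KEY_STR: ...` skeleton, provided the key `153` is quoted: per the grammar, a pair is `string : value`, so a bare `153` key is rejected and a `KEY_INT` category can never occur in valid JSON. If you need the real keys and values later, have each method return a node object instead of a string.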

If you absolutely insist on exporting it, you can find tools that will do that off the shelf. Our DMS Software Reengineering Toolkit will do that, although it might be a bit heavyweight for this kind of application.

One of the nice things about JSON is the simple grammar. Here's the grammar that DMS uses:

-- JSON.atg: JSON domain grammar for DMS 
-- Copyright (C) 2011-2018 Semantic Designs, Inc.; All Rights Reserved
--
-- Update history:
--   Date         Initials   Changes Made
--   2011/09/02   CW         Created
--
-- Note that dangling commas in lists are off-spec
-- but I (CW) hate dealing with them
--
-- I'm not sure if JSON is supposed to have more than one entity per file
-- but I'm allowing it for robustness
--
-- TODO: name/value pair lists should be associative-commutative


JSON_text = ;
JSON_text = JSON_text object ;
JSON_text = JSON_text array ;

-- unordered set of name/value pairs
-- should be able to use an associative-commutative property directive
object = '{' name_value_pair_list '}' ;

-- empty production is for empty list, but will also allow multiple commas
name_value_pair_list = ;
name_value_pair_list = name_value_pair_list ',' ;
name_value_pair_list = name_value_pair_list name_value_pair ;

name_value_pair = STRING ':' value ;

-- ordered collection of values
array = '[' value_list ']' ;
value_list = value ;
value_list = value_list ',' value ;
value_list = value_list ',' value ',' ;

value = STRING ;
value = NUMBER_INT ;
value = NUMBER_FLOAT ;
value = object ;
value = array ;
value = 'true' ;
value = 'false' ;
value = 'null' ;

Yes, it almost exactly matches the abstract grammar provided by the OP.

Now, with that, you can ask DMS to parse a file and export its AST with this command:

run ..\DomainParser +AST  ..\..\..\Examples\One.js

For the JSON file One.js, containing this text:

{
"from": "http://json.org/example.html"
}

{
"glossary": {
    "title": "example glossary",
        "GlossDiv": {
    "title": "S",
            "GlossList": {
        "GlossEntry": {
        "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
            "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
        },
                    "GlossSee": "markup"
        }
    }
    }
}
}
<rest of file snipped>

The parser produces an S-expression:

(JSON_text@JSON=2#59406e0^0 Line 1 Column 1 File C:/DMS/Domains/JSON/Examples/One.js
 (JSON_text@JSON=2#2199f60^1#59406e0:1 Line 1 Column 1 File C:/DMS/Domains/JSON/Examples/One.js
  (JSON_text@JSON=2#21912a0^1#2199f60:1 Line 1 Column 1 File C:/DMS/Domains/JSON/Examples/One.js
   (JSON_text@JSON=2#593df00^1#21912a0:1 Line 1 Column 1 File C:/DMS/Domains/JSON/Examples/One.js
   |(JSON_text@JSON=2#593d420^1#593df00:1 Line 1 Column 1 File C:/DMS/Domains/JSON/Examples/One.js
   | (JSON_text@JSON=2#593c580^1#593d420:1 Line 1 Column 1 File C:/DMS/Domains/JSON/Examples/One.js
   |  (JSON_text@JSON=1#593bec0^1#593c580:1 Line 1 Column 1 File C:/DMS/Domains/JSON/Examples/One.js)JSON_text
   |  (object@JSON=4#593c560^1#593c580:2 Line 1 Column 1 File C:/DMS/Domains/JSON/Examples/One.js
   |   (name_value_pair_list@JSON=7#593c520^1#593c560:1 Line 2 Column 5 File C:/DMS/Domains/JSON/Examples/One.js
   |   |(name_value_pair_list@JSON=5#593c420^1#593c520:1 Line 2 Column 5 File C:/DMS/Domains/JSON/Examples/One.js)name_value_pair_list
   |   |(name_value_pair@JSON=8#593c4e0^1#593c520:2 Line 2 Column 5 File C:/DMS/Domains/JSON/Examples/One.js
   |   | (STRING@JSON=24#593c400^1#593c4e0:1[`from'] Line 2 Column 5 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   | (STRING@JSON=24#593c480^1#593c4e0:2[`http://json.org/example.html'] Line 2 Column 13 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |)name_value_pair#593c4e0
   |   )name_value_pair_list#593c520
   |  )object#593c560
   | )JSON_text#593c580
   | (object@JSON=4#593d400^1#593d420:2 Line 5 Column 1 File C:/DMS/Domains/JSON/Examples/One.js
   |  (name_value_pair_list@JSON=7#593d3c0^1#593d400:1 Line 6 Column 5 File C:/DMS/Domains/JSON/Examples/One.js
   |   (name_value_pair_list@JSON=5#593c5c0^1#593d3c0:1 Line 6 Column 5 File C:/DMS/Domains/JSON/Examples/One.js)name_value_pair_list
   |   (name_value_pair@JSON=8#593d380^1#593d3c0:2 Line 6 Column 5 File C:/DMS/Domains/JSON/Examples/One.js
   |   |(STRING@JSON=24#593c5a0^1#593d380:1[`glossary'] Line 6 Column 5 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |(object@JSON=4#593d360^1#593d380:2 Line 6 Column 17 File C:/DMS/Domains/JSON/Examples/One.js
   |   | (name_value_pair_list@JSON=7#593d340^1#593d360:1 Line 7 Column 9 File C:/DMS/Domains/JSON/Examples/One.js
   |   |  (name_value_pair_list@JSON=6#593c720^1#593d340:1 Line 7 Column 9 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   (name_value_pair_list@JSON=7#593c6c0^1#593c720:1 Line 7 Column 9 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |(name_value_pair_list@JSON=5#593c600^1#593c6c0:1 Line 7 Column 9 File C:/DMS/Domains/JSON/Examples/One.js)name_value_pair_list
   |   |   |(name_value_pair@JSON=8#593c640^1#593c6c0:2 Line 7 Column 9 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   | (STRING@JSON=24#593c5e0^1#593c640:1[`title'] Line 7 Column 9 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   | (STRING@JSON=24#593c620^1#593c640:2[`example glossary'] Line 7 Column 18 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   |)name_value_pair#593c640
   |   |   )name_value_pair_list#593c6c0
   |   |  )name_value_pair_list#593c720
   |   |  (name_value_pair@JSON=8#593d320^1#593d340:2 Line 8 Column 17 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   (STRING@JSON=24#593c700^1#593d320:1[`GlossDiv'] Line 8 Column 17 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   (object@JSON=4#593d300^1#593d320:2 Line 8 Column 29 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |(name_value_pair_list@JSON=7#593d2e0^1#593d300:1 Line 9 Column 13 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   | (name_value_pair_list@JSON=6#593c880^1#593d2e0:1 Line 9 Column 13 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |  (name_value_pair_list@JSON=7#593c820^1#593c880:1 Line 9 Column 13 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   (name_value_pair_list@JSON=5#593c760^1#593c820:1 Line 9 Column 13 File C:/DMS/Domains/JSON/Examples/One.js)name_value_pair_list
   |   |   |   (name_value_pair@JSON=8#593c7e0^1#593c820:2 Line 9 Column 13 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |(STRING@JSON=24#593c740^1#593c7e0:1[`title'] Line 9 Column 13 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   |   |(STRING@JSON=24#593c780^1#593c7e0:2[`S'] Line 9 Column 22 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   |   )name_value_pair#593c7e0
   |   |   |  )name_value_pair_list#593c820
   |   |   | )name_value_pair_list#593c880
   |   |   | (name_value_pair@JSON=8#593d2c0^1#593d2e0:2 Line 10 Column 25 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |  (STRING@JSON=24#593c860^1#593d2c0:1[`GlossList'] Line 10 Column 25 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   |  (object@JSON=4#593d2a0^1#593d2c0:2 Line 10 Column 38 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   (name_value_pair_list@JSON=7#593d280^1#593d2a0:1 Line 11 Column 17 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |(name_value_pair_list@JSON=5#593c8c0^1#593d280:1 Line 11 Column 17 File C:/DMS/Domains/JSON/Examples/One.js)name_value_pair_list
   |   |   |   |(name_value_pair@JSON=8#593d260^1#593d280:2 Line 11 Column 17 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   | (STRING@JSON=24#593c8a0^1#593d260:1[`GlossEntry'] Line 11 Column 17 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   |   | (object@JSON=4#593d240^1#593d260:2 Line 11 Column 31 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |  (name_value_pair_list@JSON=7#593d200^1#593d240:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   (name_value_pair_list@JSON=6#593d160^1#593d200:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |(name_value_pair_list@JSON=7#593d120^1#593d160:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   | (name_value_pair_list@JSON=6#593cde0^1#593d120:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |  (name_value_pair_list@JSON=7#593cd60^1#593cde0:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   (name_value_pair_list@JSON=6#593cca0^1#593cd60:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   |(name_value_pair_list@JSON=7#593cc60^1#593cca0:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   | (name_value_pair_list@JSON=6#593cc00^1#593cc60:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   |  (name_value_pair_list@JSON=7#593cb80^1#593cc00:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   |   (name_value_pair_list@JSON=6#593cb00^1#593cb80:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   |   |(name_value_pair_list@JSON=7#593cac0^1#593cb00:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   |   | (name_value_pair_list@JSON=6#593ca60^1#593cac0:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   |   |  (name_value_pair_list@JSON=7#593ca00^1#593ca60:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   |   |   (name_value_pair_list@JSON=5#593c900^1#593ca00:1 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js)name_value_pair_list
   |   |   |   |   |   |   |   (name_value_pair@JSON=8#593c9c0^1#593ca00:2 Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   |   |   |(STRING@JSON=24#593c8e0^1#593c9c0:1[`ID'] Line 12 Column 21 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   |   |   |   |   |   |(STRING@JSON=24#593c920^1#593c9c0:2[`SGML'] Line 12 Column 27 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   |   |   |   |   |   )name_value_pair#593c9c0
   |   |   |   |   |   |   |  )name_value_pair_list#593ca00
   |   |   |   |   |   |   | )name_value_pair_list#593ca60
   |   |   |   |   |   |   | (name_value_pair@JSON=8#593caa0^1#593cac0:2 Line 13 Column 41 File C:/DMS/Domains/JSON/Examples/One.js
   |   |   |   |   |   |   |  (STRING@JSON=24#593ca40^1#593caa0:1[`SortAs'] Line 13 Column 41 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   |   |   |   |   |  (STRING@JSON=24#593ca80^1#593caa0:2[`SGML'] Line 13 Column 51 File C:/DMS/Domains/JSON/Examples/One.js)STRING
   |   |   |   |   |   |   | )name_value_pair#593caa0
   |   |   |   |   |   |   |)name_value_pair_list#593cac0

I've truncated the output because nobody really wants to see the whole tree. There's a lot of "extra" stuff in the tree, such as node locations and source line numbers, all of which can easily be eliminated or ignored.
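If you do post-process the dump as text, stripping that extra information takes only a few regular expressions. The sketch below is based purely on the shape of the excerpt above, not on any documented DMS output format, and the class `AstDumpCleaner` is a hypothetical helper, not part of DMS. It removes the `@JSON=...` annotation, the `Line ... Column ... File ...` suffix, and the `#...` id on closing lines, and it assumes file paths contain neither spaces nor `)`.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the patterns are inferred only from the dump excerpt shown above.
class AstDumpCleaner {
    // e.g. "@JSON=24#593c400^1#593c4e0:1" or, for the root node, "@JSON=2#59406e0^0"
    private static final Pattern NODE_INFO =
        Pattern.compile("@\\w+=\\d+#[0-9a-f]+\\^\\d+(#[0-9a-f]+:\\d+)?");
    // e.g. " Line 2 Column 5 File C:/DMS/Domains/JSON/Examples/One.js"
    private static final Pattern LOCATION =
        Pattern.compile(" Line \\d+ Column \\d+ File [^)\\s]+");
    // e.g. ")name_value_pair#593c4e0" at the end of a closing line
    private static final Pattern CLOSE_ID =
        Pattern.compile("(\\)\\w+)#[0-9a-f]+$");

    /** Cleans a single line of the dump; indentation and tree markers are kept. */
    static String clean(String line) {
        String s = NODE_INFO.matcher(line).replaceAll("");
        s = LOCATION.matcher(s).replaceAll("");
        return CLOSE_ID.matcher(s).replaceAll("$1");
    }
}
```

Applied line by line, this leaves only the node type names and the literal values in brackets. A string value that itself happens to contain text resembling a location suffix would be mangled, which is another reason to process the AST in memory rather than as exported text.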

Ira Baxter