In broad strokes you have 3 options.
But before we go through those: there are perfectly fine JSON parsing libraries out there. The standard org.json one is extremely hard to use and not at all recommended, but there's GSON and Jackson. Both are open source, very widely used, and do all or most of this stuff already. Not sure why you want to reinvent this wheel. But, in the spirit of the question:
Let's work with an example:
{
  "movies": [
    {
      "title": "A Few Good Men",
      "director": {
        "name": "Rob Reiner",
        "dob": {
          "year": 1947,
          "month": 3,
          "day": 6
        }
      },
      "stars": [
        "Demi Moore",
        "Tom Cruise",
        "Jack Nicholson"
      ]
    }
  ]
}
Just cast em
This is what the org.json library does, and it results in excessively verbose, hard-to-write code.
Your json.parse method returns a JsonObject, which is an abstract class and not particularly useful until you instanceof check what it actually is and cast from there.
If, as a user of this library, you know pretty much exactly how the data is structured and what you're looking for, then in old-timey Java you'd write:
JsonValue j = parse(jsonString);
JsonList movies = (JsonList) ((JsonObject) j).get("movies");
for (JsonValue m : movies) {
int birthYearOfDirector = ((JsonNumber) ((JsonObject) ((JsonObject) ((JsonObject) m).get("director")).get("dob")).get("year")).asInt();
}
which surely needs no further debate about how ridiculous that is.
With pattern-matching switches you have a few more options, but it really doesn't get any better.
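To illustrate why pattern matching doesn't help much, here is a rough sketch of the same lookup written with switch patterns. It assumes Java 21 or later, and the sealed JsonValue hierarchy below is a hypothetical stand-in for whatever types your parser actually produces. The -1 fallback is likewise just an arbitrary choice for this example.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sealed hierarchy standing in for the parser's value types.
sealed interface JsonValue permits JsonObject, JsonList, JsonNumber, JsonString {}

record JsonObject(Map<String, JsonValue> members) implements JsonValue {
    JsonValue get(String key) { return members.get(key); }
}

record JsonList(List<JsonValue> items) implements JsonValue {}

record JsonNumber(double value) implements JsonValue {
    int asInt() { return (int) value; }
}

record JsonString(String value) implements JsonValue {}

class PatternSwitchDemo {
    // Returns the director's birth year, or -1 if the shape is not what we expect.
    static int directorBirthYear(JsonValue movie) {
        return switch (movie) {
            case JsonObject m -> switch (m.get("director")) {
                case JsonObject d -> switch (d.get("dob")) {
                    case JsonObject dob -> switch (dob.get("year")) {
                        case JsonNumber n -> n.asInt();
                        case null, default -> -1;
                    };
                    case null, default -> -1;
                };
                case null, default -> -1;
            };
            case null, default -> -1;
        };
    }
}
```

The casts are gone and a wrong shape no longer throws a ClassCastException, but one level of nesting per JSON level is hardly an improvement in readability.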
This style of parsing is considerably less convoluted if you do not know what you're looking for; for example, if you're writing a JSON UI widget that just renders it, this really isn't that bad.
The key problem
The problem is that JSON is untyped (a document carries no schema), but in Java we really need those types so that the parser can provide the data in the way the library user wants it. This means the structure and types of the JSON need to be provided externally.
Marshalling
One obvious way to do it is to use simple Java classes or records to represent the data. The library would work as follows:
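import java.time.LocalDate;
import java.util.List;

public record Movies(List<Movie> movies) {}
public record Movie(String title, Director director, List<String> stars) {}
// Mapping the {year, month, day} object to a LocalDate needs a custom parser.
public record Director(String name, LocalDate dob) {}
// to use:
Movies movies = jsonParser.parse(jsonString, Movies.class);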
Field names are public, record constructors are well defined, and field types are public and 'reified' (you can look up the part in the <> reflectively; for field and record component declarations it is not erased). This complicates the library, as it has to go on a reflective spree to bind it all together, but it can be done, and it is something popular JSON libraries such as GSON and Jackson offer. The same system can be used to turn an instance into JSON. Some obvious concerns:
You really need a library of custom parsers; often LocalDate x; should just work for JSON data of the form {"x": "1970-12-31"}. GSON and co let you register 'parsers' and the like, which are handed some JsonValue and a type (e.g. LocalDate), and this lets you handle such things. That does make it more complicated. You would hardcode a few parsers (certainly all number types, string, and list, possibly (hash)map); everything else needs custom ones.
You want the ability to add parse or format hints, such as wishing for a long to nevertheless be rendered as a string (note that if you send your JSON to javascript, or even just get it parsed elsewhere given that JSON is defined in terms of javascript, all numbers are effectively double, so the usual problems with doubles and rounding occur. Trying to store a very large id, e.g. above 2^53, in a JSON number is a really bad idea as a consequence: whatever you send it to is extremely likely to round it). Annotations with RUNTIME retention can do that.
Marshalling is entirely useless if the user of the library doesn't know the structure of the JSON you're working with (such as when writing a 'JSON viewer' UI widget). This is why GSON and Jackson offer both this and a more 'just cast-em' like library.
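The 'reflective spree' mentioned above can be sketched roughly as follows. This is a minimal illustration, not how GSON or Jackson actually implement it: RecordBinder is a hypothetical name, the input is assumed to be an already-parsed tree of Map, List, and String values, and only records, lists, and pass-through leaves are handled. It needs Java 16 or later for records.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.RecordComponent;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

record Movies(List<Movie> movies) {}
record Movie(String title, List<String> stars) {}

// Hypothetical binder: maps a parsed tree onto records via reflection.
class RecordBinder {
    static <T> T bind(Object json, Class<T> target) {
        return target.cast(bindType(json, target));
    }

    private static Object bindType(Object json, Type target) {
        if (json == null) return null; // missing key: leave the component null

        // List<X>: the element type X survives erasure on record components.
        if (target instanceof ParameterizedType p && p.getRawType() == List.class) {
            Type elementType = p.getActualTypeArguments()[0];
            List<Object> out = new ArrayList<>();
            for (Object element : (List<?>) json) out.add(bindType(element, elementType));
            return out;
        }

        // Record: bind each component by name, then call the canonical constructor.
        if (target instanceof Class<?> c && c.isRecord()) {
            Map<?, ?> object = (Map<?, ?>) json;
            RecordComponent[] components = c.getRecordComponents();
            Class<?>[] parameterTypes = new Class<?>[components.length];
            Object[] arguments = new Object[components.length];
            for (int i = 0; i < components.length; i++) {
                parameterTypes[i] = components[i].getType();
                arguments[i] = bindType(object.get(components[i].getName()),
                        components[i].getGenericType());
            }
            try {
                Constructor<?> constructor = c.getDeclaredConstructor(parameterTypes);
                constructor.setAccessible(true);
                return constructor.newInstance(arguments);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot construct " + c.getName(), e);
            }
        }

        return json; // strings and other leaves pass through unchanged
    }
}
```

A real library would add number conversion, primitive handling, and the custom parser registry discussed above (for LocalDate and friends) in place of the final pass-through line.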
Replace get() with asInt() and friends.
Instead of forcing the user to check what a thing is (with instanceof), or cast a thing to what they know it is, make methods for each type instead:
Json j = parse(jsonString);
for (Json m : j.get("movies").list()) {
int directorBirthYear = m.get("director").get("dob").get("year").asInt(0);
}
There is no JsonNumber, JsonBoolean, etcetera; instead there's just Json (though you may have package-private subtypes that implement each JSON principal data type). .get() just returns a wrapper object with the 'path' encoded in it (you don't want 'pathing' into non-existent spaces to throw exceptions); only asX() does a lookup. These methods come in many variants (asBoolean, asStringList, and so on), and each variant is overloaded: a no-args version that throws if the path you are in doesn't exist, and a version that takes a default, which is returned if the path does not exist. If the path does exist but the value you find there is fundamentally incompatible, it's a design decision whether the default should be returned or an exception should be thrown (e.g. if you call .asInt(100) and the value there is "foobar", what now?).
The same 'what now?' design question arises if e.g. the input JSON has a "director" k/v pair but its value is simply 18, instead of a JSON object that contains a key named "dob".
The point is, the vast majority of JSON out there tends to have known structure (known to the library user), but often omits data that cannot be supplied or isn't relevant, and usually you want to handle that by going with a default value. This path solution is compatible with that idea, and is highly efficient (in the 'simple to write and understand, succinct code' sense), if you know what you're looking for.
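As a rough sketch of the path idea, assuming the JSON has already been parsed into plain Map, List, and Number values: the Json class and its of() factory below are hypothetical, only asInt is shown, and list elements are re-wrapped as fresh roots for brevity (so their error messages lose the outer path). The 'incompatible value' question is settled one particular way here: the defaulted variant returns the default and the no-args variant throws.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical path-based wrapper over an already-parsed tree.
final class Json {
    private final Object root;
    private final List<String> path;

    private Json(Object root, List<String> path) {
        this.root = root;
        this.path = path;
    }

    static Json of(Object parsedTree) {
        return new Json(parsedTree, List.of());
    }

    // Never throws: it only records one more step of the path.
    Json get(String key) {
        List<String> extended = new ArrayList<>(path);
        extended.add(key);
        return new Json(root, extended);
    }

    // Walks the recorded path; null if a step is missing or is not an object.
    private Object resolve() {
        Object current = root;
        for (String key : path) {
            if (!(current instanceof Map<?, ?> object)) return null;
            current = object.get(key);
        }
        return current;
    }

    // No-args variant: throws if there is no number at this path.
    int asInt() {
        if (resolve() instanceof Number n) return n.intValue();
        throw new NoSuchElementException("No number at " + String.join(".", path));
    }

    // Defaulted variant: falls back if the path is missing or not a number.
    int asInt(int defaultValue) {
        return resolve() instanceof Number n ? n.intValue() : defaultValue;
    }

    // Returns an empty list if there is no JSON array at this path.
    List<Json> list() {
        List<Json> out = new ArrayList<>();
        if (resolve() instanceof List<?> items) {
            for (Object item : items) out.add(Json.of(item));
        }
        return out;
    }
}
```

With this in place, a chain such as m.get("director").get("dob").get("year").asInt(0) cannot fail halfway: every get() succeeds, and only the final asInt decides between the value and the default.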
You can make an explicit JsonValue type hierarchy and add an asValue() method to cater to the 'UI widget that can view arbitrary JSON' use case if you need this.