Resolving "$ref" in serde_json/serde_yaml

Question

What is the best way to recursively resolve all JSON $ref in JSON documents?

A direct approach is to create a Deserialize implementation that accepts either a {"$ref"} object or a desired value, say an enum like this:

enum JRef<T> {
    Value(T),
    Ref(URI),
}

impl<'de, T> Deserialize<'de> for JRef<T> { ... }

But this would require me to wrap all my types in JRef<T>, so my derive(Deserialize) schema struct fills up with boilerplate:

#[derive(Deserialize)]
struct Foo {
    a: JRef<Bar>,
    b: JRef<(JRef<Qux>, JRef<Corge>)>,
    ...
}

In addition, the resolution of the reference is also boilerplate.

I understand that deserializing a type should not depend on external state, so it makes sense to deserialize into JRef first and resolve the references later. To avoid the boilerplate, that later resolution could be generated by a custom proc-macro: it would derive a JRef-free counterpart of the original type and implement a trait that accepts a foreign function and uses it to resolve URIs.
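
To make the idea concrete, here is a hand-written sketch of roughly what such a macro might expand to for one small struct. Everything in it is an assumption for illustration: the URI is kept as a plain String, Bar, RawFoo and Foo are hypothetical types, and the "foreign function" is any closure mapping a URI to an already-parsed value. It uses only std, so the serde side is left out.

```rust
/// Either an inline value or a not-yet-resolved reference.
/// The URI is a plain String here (an assumption of this sketch).
#[derive(Debug, Clone, PartialEq)]
enum JRef<T> {
    Value(T),
    Ref(String),
}

impl<T> JRef<T> {
    /// Return the inline value, or ask `fetch` to resolve the URI.
    fn resolve<F>(self, fetch: &F) -> Result<T, String>
    where
        F: Fn(&str) -> Option<T>,
    {
        match self {
            JRef::Value(value) => Ok(value),
            JRef::Ref(uri) => {
                fetch(uri.as_str()).ok_or_else(|| format!("unresolved $ref: {}", uri))
            }
        }
    }
}

/// Hypothetical leaf type.
#[derive(Debug, Clone, PartialEq)]
struct Bar {
    n: i32,
}

/// The type as it would come out of deserialization, references included.
#[derive(Debug, Clone, PartialEq)]
struct RawFoo {
    a: JRef<Bar>,
    b: JRef<Bar>,
}

/// The JRef-free counterpart that the rest of the program works with.
#[derive(Debug, Clone, PartialEq)]
struct Foo {
    a: Bar,
    b: Bar,
}

impl RawFoo {
    /// Resolve every field; this per-field code is what a derive macro would generate.
    fn resolve<F>(self, fetch: &F) -> Result<Foo, String>
    where
        F: Fn(&str) -> Option<Bar>,
    {
        Ok(Foo {
            a: self.a.resolve(fetch)?,
            b: self.b.resolve(fetch)?,
        })
    }
}
```

Calling raw_foo.resolve(&fetch) then yields either a plain Foo or the first reference that could not be resolved. With fields of several different types, the single closure would probably have to return raw JSON that each field deserializes itself; that part is not shown here.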

So far, I have dug deep into the dependency ecosystem just to resolve a single $ref. Do I really need to do all of this, or is there a simpler solution I have missed?

That is a *really* loaded question. I also do not think the `Deserialize` step is a good place to do that - it is probably better to do the fetching one step before, since by that point you'll have JSON objects (of whatever library you use) instead of a single `{ "$ref": "..." }` key, which makes `Deserialize` work out of the box. — Sébastien Renauld, Sep 11 '19 at 13:16
You don't really know what references you want to load before the deserialization. In some schemas, not every single `{"$ref"}` would need to be resolved. Parsing with serde_json might be necessary to locate the exact references that need to be resolved. — SOFe, Sep 11 '19 at 13:29
@user3054986 I used a postprocessing step after the serde parsing. It seems that serde doesn't really like stateful conversion. On an unrelated note, I also developed [xylem](https://github.com/SOF3/xylem), a stateful conversion framework, which is designed to serve as a post-serde processing stage. — SOFe, Nov 19 '22 at 14:30
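
As a rough illustration of resolving at the level of generic JSON objects, as the comments suggest, the sketch below walks a value tree and replaces every `{"$ref": "#/..."}` object with its target. It is a sketch under assumptions, not a definitive implementation: `Value` is a minimal std-only stand-in for something like `serde_json::Value`, only local references starting with `#/` are followed, JSON Pointer escapes (`~0`, `~1`) are not handled, every reference is resolved (a schema that needs only some of them resolved would need a filter), and the depth limit of 32 is an arbitrary guard against cycles.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a generic JSON value (assumption of this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Num(f64),
    Str(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

/// Follow a local reference such as "#/defs/bar" from the document root.
fn lookup<'a>(root: &'a Value, reference: &str) -> Option<&'a Value> {
    let path = reference.strip_prefix("#/")?;
    let mut node = root;
    for key in path.split('/') {
        node = match node {
            Value::Object(map) => map.get(key)?,
            Value::Array(items) => items.get(key.parse::<usize>().ok()?)?,
            _ => return None,
        };
    }
    Some(node)
}

/// Return a copy of `node` with every `$ref` object replaced by its target.
/// Keys that sit next to "$ref" in the same object are dropped.
fn resolve(root: &Value, node: &Value, depth: usize) -> Result<Value, String> {
    if depth > 32 {
        return Err("reference chain too deep (possible cycle)".to_string());
    }
    match node {
        Value::Object(map) => {
            if let Some(Value::Str(reference)) = map.get("$ref") {
                let target = lookup(root, reference)
                    .ok_or_else(|| format!("unresolved $ref: {}", reference))?;
                // The target may itself contain references, so resolve it too.
                return resolve(root, target, depth + 1);
            }
            let mut out = BTreeMap::new();
            for (key, value) in map {
                out.insert(key.clone(), resolve(root, value, depth)?);
            }
            Ok(Value::Object(out))
        }
        Value::Array(items) => {
            let mut out = Vec::with_capacity(items.len());
            for item in items {
                out.push(resolve(root, item, depth)?);
            }
            Ok(Value::Array(out))
        }
        other => Ok(other.clone()),
    }
}
```

Calling `resolve(&root, &root, 0)` gives a reference-free tree that an ordinary derived `Deserialize` type could then be built from.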

score 0 · Answer 1 · answered Nov 19 '22 at 10:38

I think it depends. What is the further purpose of your code?

Do you just need deserialization for code generation?
Are you modifying the JSON and serializing it again?

For the first case, there is a good example: schemafy.

A) schemafy_lib/src/schema.rs contains the schema to be serialized. It includes a $ref, of type Option.
B) schemafy_lib/src/lib.rs contains an Expander, which transforms the $ref into Rust source code. (At least that is my understanding.)

I have to deal with the latter case: serializing as well as deserializing. From the first example I take A and skip B. I plan to put my Schema into an object that hides the $ref, so that clients can deal with JSON objects transparently.

Comments are welcome.
