3

Say I have this struct:

use serde::{Serialize, Deserialize};

#[derive(Deserialize)]
struct MyStruct {
    field_1: Option<usize>, // should only have field_1 or field_2
    field_2: Option<usize>, // should only have field_1 or field_2
    other_field: String,
}

How can I deserialize this but only allow one of these fields to exist?

cactus
  • Use an enum, I guess. – Dev611 Nov 04 '21 at 04:34
  • Are the two fields mutually exclusive? Or, are there cases where they are both present? – Joe_Jingyu Nov 04 '21 at 05:19
  • @Joe_Jingyu they should be mutually exclusive, I want to stop cases where they might be attempted to be both included. – cactus Nov 04 '21 at 05:23
  • If so, as @Dev611 also advised, using an `enum` instead of a `struct` looks like a good way to go. – Joe_Jingyu Nov 04 '21 at 05:28
  • I should've included in the example but there will be other fields in the `struct` also – cactus Nov 04 '21 at 05:29
  • Then, is it acceptable to replace the two fields with a single field of an enum type that has field_1 and field_2 as its two variants? If not, I don't think it is possible to hide a field of a struct. Maybe you can consider the solution in this [post](https://stackoverflow.com/questions/44331037/how-can-i-distinguish-between-a-deserialized-field-that-is-missing-and-one-that), which uses an enum to distinguish three states: `Some(value)`, `None` and missing. – Joe_Jingyu Nov 04 '21 at 05:47

3 Answers

4

The suggestions in the comments to use an enum are likely your best bet. You don't need to replace your struct with an enum; instead, you'd add a separate enum type to represent this constraint, e.g.:

use serde::Deserialize;

#[derive(Deserialize)]
enum OneOf {
    F1(usize),
    F2(usize),
}

#[derive(Deserialize)]
struct MyStruct {
    one_of_field: OneOf,
    other_field: String,
}

Now MyStruct's one_of_field can be initialized with either an F1 or an F2.
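
To illustrate what the enum buys you on the Rust side, here is a minimal serde-free sketch. It reuses the OneOf and MyStruct shapes from above without the derives, and describe is a hypothetical helper added only for illustration. Because one_of_field always holds exactly one variant, code that consumes the struct never has to handle a "both set" case.

```rust
/// Same shape as the enum above; the serde derives are omitted so that the
/// sketch compiles with the standard library alone.
#[derive(Debug, PartialEq)]
enum OneOf {
    F1(usize),
    F2(usize),
}

#[derive(Debug, PartialEq)]
struct MyStruct {
    one_of_field: OneOf,
    other_field: String,
}

/// Hypothetical helper (not part of the answer): the match is exhaustive
/// over exactly two cases, so there is no "both present" branch to write.
fn describe(s: &MyStruct) -> String {
    match &s.one_of_field {
        OneOf::F1(n) => format!("F1 = {}, other_field = {}", n, s.other_field),
        OneOf::F2(n) => format!("F2 = {}, other_field = {}", n, s.other_field),
    }
}
```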

dimo414
3

@dimo414's answer is right that an enum is necessary, but the code sample does not behave the way the question requires. This comes down to a couple of factors in how enums are deserialized. Mainly, it does not enforce mutual exclusion of the two variants: it silently picks the first variant that matches and ignores extraneous fields. The other issue is that the enum is treated as a separate structure nested within MyStruct (e.g. {"one_of_field":{"F1":123},"other_field":"abc"} in JSON).

Solution

For anyone wanting an easy solution, here it is. Keep in mind, however, that variants of the mutually exclusive type cannot contain #[serde(flatten)] fields (more information in The Issue section below). To allow inputs with neither field_1 nor field_2, Option<MutuallyExclusive> can be used in MyStruct.

use serde::Deserialize;

/// Enum containing mutually exclusive fields. Variant names will be used as the
/// names of fields unless annotated with `#[serde(rename = "field_name")]`.
#[derive(Deserialize)]
#[allow(non_camel_case_types)] // variant names double as the field names
enum MutuallyExclusive {
    field_1(usize),
    field_2(usize),
}

#[derive(Deserialize)]
/// `deny_unknown_fields` is required. If not included, it will not error when
/// `field_1` and `field_2` are both present.
#[serde(deny_unknown_fields)]
struct MyStruct {
    /// Flatten makes it so the variants of MutuallyExclusive are seen as fields of 
    /// this struct. Without it, foo would be treated as a separate struct/object held
    /// within this struct.
    #[serde(flatten)]
    foo: MutuallyExclusive,
    other_field: String,
}

The Issue

TL;DR: It should be fine to use deny_unknown_fields with flatten in this way so long as types used in MutuallyExclusive do not use flatten.

If you read the serde documentation, you may notice it warns that using deny_unknown_fields in conjunction with flatten is unsupported. This is problematic, as it calls the long-term reliability of the above code into question. As of this writing, serde does not produce any errors or warnings for this configuration and handles it as intended.

The pull request adding this warning cited 3 issues when doing so:

To be honest, I don't really care about the first one. I personally feel it is a bit overly pedantic. It simply states that the error message is not exactly identical between a type and a wrapper for that type using flatten for errors triggered by deny_unknown_fields. This should not have any effect on the functionality of the code above.

However, the other two issues are relevant. They both relate to nested flatten types within a deny_unknown_fields type. Technically, the second issue uses untagged for the second layer of nesting, but it has the same effect as flatten in this context. The main idea is that deny_unknown_fields is unable to handle more than a single level of nesting without causing issues. The use case is not in any way at fault, but the way deny_unknown_fields and flatten are handled makes it difficult to implement a workaround.

Alternative

However, if anyone still feels uncomfortable with using the above code, you can use this version instead. It will be a pain to work with if there are a lot of other fields, but sidesteps the warning in the documentation.

#[derive(Debug, Deserialize)]
#[serde(untagged, deny_unknown_fields)]
enum MyStruct {
    WithField1 {
        field_1: usize,
        other_field: String,
    },
    WithField2 {
        field_2: usize,
        other_field: String,
    },
}
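
To show what "a pain to work with" can look like in practice, and one way to soften it, here is a serde-free sketch. It keeps the same variants as the enum above but drops the serde attributes so it builds with the standard library alone. The accessor methods are hypothetical additions, not part of the solution itself.

```rust
/// Same variants as the untagged enum above, minus the serde attributes.
#[derive(Debug)]
enum MyStruct {
    WithField1 { field_1: usize, other_field: String },
    WithField2 { field_2: usize, other_field: String },
}

impl MyStruct {
    /// Hypothetical accessor: an or-pattern reaches the shared field
    /// without the caller caring which variant was parsed.
    fn other_field(&self) -> &str {
        match self {
            MyStruct::WithField1 { other_field, .. }
            | MyStruct::WithField2 { other_field, .. } => other_field,
        }
    }

    /// Hypothetical accessor: `Some` only for the `WithField1` variant.
    fn field_1(&self) -> Option<usize> {
        match self {
            MyStruct::WithField1 { field_1, .. } => Some(*field_1),
            MyStruct::WithField2 { .. } => None,
        }
    }

    /// Hypothetical accessor: `Some` only for the `WithField2` variant.
    fn field_2(&self) -> Option<usize> {
        match self {
            MyStruct::WithField1 { .. } => None,
            MyStruct::WithField2 { field_2, .. } => Some(*field_2),
        }
    }
}
```

Every shared field needs to be repeated in each variant and in each or-pattern, which is where the maintenance cost comes from as the number of other fields grows.
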
Locke
  • Can you share a Rust Playground example of Serde silently picking the first matching variant? – dimo414 Nov 27 '22 at 05:10
  • @dimo414 this was a couple months ago so I do not have the original test cases. If I remember correctly, this situation only occurs when an input can satisfy multiple variants. Since `serde` will silently ignore fields that are not present on the destination structure (unless you use `deny_unknown_fields`) it is able to stop after successfully parsing the first variant. Generally this is helpful since it lets you ignore fields that are not relevant to your use-case, but fields that are truly mutually exclusive should produce an error if both are present at the same time. – Locke Nov 27 '22 at 09:33
  • I'm not clear why there's an issue with unknown fields in this circumstance; both variants of the enum are known to the deserializer, so I'm surprised it would consider the second variant "unknown". – dimo414 Nov 27 '22 at 22:49
  • @dimo414, unknown fields simply refers to fields which are unused while parsing. If it received `{"field_1":3,"field_2":4}`, then it would find `"field_2":4` to be unknown for the variant `field_1(usize)` and vice versa. Since both variants result in unused fields it is unable to satisfy `MutuallyExclusive` and an error is produced. – Locke Nov 27 '22 at 23:19
  • @dimo414, here is a [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=89035505eac98a1ee69630170fbf7a35) you can use to try different solutions. So far the solution I posted has been the best one I have found, but it is not perfect. Ideally it would enforce mutual exclusion on the two fields without requiring `deny_unknown_fields` be enforced on the entire object. I would be interested to see if you (or anyone else) finds a solution to this problem. – Locke Nov 27 '22 at 23:38
1

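You can deserialize your struct and then verify all the invariants your type should uphold. You can implement Deserialize for your type to do this while still relying on the derive macro to do the heavy lifting. With #[serde(remote = "Self")], the derive generates an inherent MyStruct::deserialize function instead of a trait impl, and the manual impl below calls it.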

use serde::{Deserialize, Deserializer};

#[derive(Debug, Deserialize)]
#[serde(remote = "Self")]
struct MyStruct {
    field_1: Option<usize>, // should only have field_1 or field_2
    field_2: Option<usize>, // should only have field_1 or field_2
    other_field: String,
}

impl<'de> Deserialize<'de> for MyStruct {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        use serde::de::Error;
        
        let s = Self::deserialize(deserializer)?;
        if s.field_1.is_some() && s.field_2.is_some() {
            return Err(D::Error::custom("should only have field_1 or field_2"));
        }
        
        Ok(s)
    }
}

fn main() {
    dbg!(serde_json::from_value::<MyStruct>(serde_json::json!({
        "field_1": 123,
        "other_field": "abc"
    })));
    dbg!(serde_json::from_value::<MyStruct>(serde_json::json!({
        "field_2": 456,
        "other_field": "abc"
    })));
    dbg!(serde_json::from_value::<MyStruct>(serde_json::json!({
        "field_1": 123,
        "field_2": 456,
        "other_field": "abc"
    })));
}

Playground
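
The check inside the Deserialize impl above boils down to a plain function over the two Option fields. As a rough, serde-free sketch of that idea, the function below rejects the "both present" case and maps the remaining cases onto an enum, so that later code cannot observe the invalid state. The names Exclusive and exclusive are hypothetical and only for illustration; the error text is the one used above.

```rust
/// Hypothetical result type: the three states that are allowed to exist.
#[derive(Debug, PartialEq)]
enum Exclusive {
    Field1(usize),
    Field2(usize),
    Neither,
}

/// Hypothetical helper: rejects inputs where both fields are set and maps
/// every other combination onto `Exclusive`.
fn exclusive(field_1: Option<usize>, field_2: Option<usize>) -> Result<Exclusive, String> {
    match (field_1, field_2) {
        (Some(_), Some(_)) => Err("should only have field_1 or field_2".to_string()),
        (Some(a), None) => Ok(Exclusive::Field1(a)),
        (None, Some(b)) => Ok(Exclusive::Field2(b)),
        (None, None) => Ok(Exclusive::Neither),
    }
}
```

Like the check above, this accepts input where neither field is present; if exactly one field is required, the (None, None) arm could return an error instead.
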

jonasbb