I have an API that requires an object's fields to be sorted alphabetically because the struct has to be hashed.

In Java/Jackson, you can set a flag in the serializer: MapperFeature.SORT_PROPERTIES_ALPHABETICALLY. I can't find anything similar in Serde.

I'm using rmp-serde (MessagePack). It follows the annotations and serialization process used for JSON, so I thought that it would be fully compatible, but the sorting provided by @jonasbb doesn't work for it.

The struct has (a lot of) nested enums and structs which have to be flattened for the final representation. I'm implementing Serialize::serialize manually for that, but calling state.serialize_field at the right place (such that everything is alphabetical) is a pain: the enums need a match clause, so serialize_field has to be called multiple times for the same field in different places, and the code is very difficult to follow.

As possible solutions, two ideas:

  1. Create a new struct with the flat representation and sort the fields alphabetically manually.

    This is a bit error prone, so a programmatic sorting solution for this flattened struct would be great.

  2. Buffer the key values in Serialize::serialize (e.g. in a BTreeMap, which is sorted), and call state.serialize_field in a loop at the end.

    The problem is that the values seem to have to be of type Serialize, which isn't object safe, so I wasn't able to figure out how to store them in the map.
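Idea 2 can be sketched without storing `Serialize` trait objects at all: convert every field into one concrete value type first, and buffer those. The sketch below uses only the standard library so that it stands on its own; `Value`, `Shape`, `Item`, `flatten` and `emission_order` are hypothetical names for illustration. In real code the value type might be something like `serde_json::Value` or `rmpv::Value`, and the final loop would pass each entry to the serializer (for example via `SerializeMap::serialize_entry`). Whether the backend accepts the resulting encoding is an assumption that needs checking.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a concrete, format-level value type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

/// Hypothetical nested enum whose fields must be flattened into the parent.
enum Shape {
    Circle { radius: i64 },
    Rect { width: i64, height: i64 },
}

/// Hypothetical top-level struct.
struct Item {
    name: String,
    shape: Shape,
}

/// Buffers all flattened fields in a BTreeMap.
/// Insertion order is irrelevant; the map keeps its keys sorted.
fn flatten(item: &Item) -> BTreeMap<&'static str, Value> {
    let mut fields = BTreeMap::new();
    fields.insert("name", Value::Text(item.name.clone()));
    match &item.shape {
        Shape::Circle { radius } => {
            fields.insert("kind", Value::Text("circle".to_string()));
            fields.insert("radius", Value::Int(*radius));
        }
        Shape::Rect { width, height } => {
            fields.insert("kind", Value::Text("rect".to_string()));
            fields.insert("width", Value::Int(*width));
            fields.insert("height", Value::Int(*height));
        }
    }
    fields
}

/// Returns the keys in the order a final "serialize each entry" loop would visit them.
fn emission_order(item: &Item) -> Vec<&'static str> {
    flatten(item).into_keys().collect()
}
```

Each match arm inserts its fields exactly once, in whatever order is convenient, and the sorting happens in one place.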

"How to sort HashMap keys when serializing with serde?" is similar but does not answer this, because my question is about the ordering of the struct's fields/properties.

user
  • If they are arrays, just pre-sort them. If they are object-like, it doesn't really matter because the deserializing side can reorder them arbitrarily depending on what it is. – Netwave Jun 01 '21 at 12:57
  • @Netwave it's object-like and it does matter because the endpoint I'm interacting with requires them to be sorted alphabetically. – user Jun 01 '21 at 13:15
  • @Netwave The most common C# json framework also requires "$type" to be first, but this is only an issue if you can't modify the receiver (or the receiver needs high performance) and if you need to have dynamic types in the data structure. My point is it's not unheard of for the order to matter. – piojo Jun 01 '21 at 16:04

1 Answer

You do not say which data format you are targeting. This makes it hard to find a solution, since some approaches might not work in all cases.

This code works if you are using JSON (unless the preserve_order feature flag is used). The same would work for TOML by serializing into toml::Value as an intermediate step. The solution will also work for other data formats, but it might result in a different serialization, for example, emitting the data as a map instead of as a struct.

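use serde::{Deserialize, Serialize};

// Round-trips the value through serde_json::Value. Its object map keeps keys
// sorted (unless preserve_order is enabled), so fields are re-emitted in order.
fn sort_alphabetically<T: Serialize, S: serde::Serializer>(value: &T, serializer: S) -> Result<S::Ok, S::Error> {
    let value = serde_json::to_value(value).map_err(serde::ser::Error::custom)?;
    value.serialize(serializer)
}

// Wrapper that applies the sorting to whatever it contains.
#[derive(Serialize)]
struct SortAlphabetically<T: Serialize>(
    #[serde(serialize_with = "sort_alphabetically")]
    T
);

// Field names are deliberately out of order and mixed-case.
#[allow(non_snake_case)]
#[derive(Serialize, Deserialize, Default, Debug)]
struct Foo {
    z: (),
    bar: (),
    ZZZ: (),
    aAa: (),
    AaA: (),
}

fn main() -> Result<(), serde_json::Error> {
    println!("{}", serde_json::to_string_pretty(&SortAlphabetically(&Foo::default()))?);
    Ok(())
}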
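One thing to check is what "alphabetically" means for the receiving side. Assuming the intermediate map is a `BTreeMap<String, _>` (serde_json's default when preserve_order is off), keys are ordered by plain string comparison, which is byte-wise, so all uppercase ASCII letters sort before lowercase ones. A minimal standard-library sketch of that ordering, using the field names of `Foo` (`sorted_keys` is a hypothetical helper):

```rust
use std::collections::BTreeMap;

/// Inserts the given keys into a BTreeMap and returns them in the map's
/// iteration order, i.e. sorted by byte-wise string comparison.
fn sorted_keys(keys: &[&str]) -> Vec<String> {
    let map: BTreeMap<String, ()> = keys.iter().map(|k| (k.to_string(), ())).collect();
    map.into_keys().collect()
}

// sorted_keys(&["z", "bar", "ZZZ", "aAa", "AaA"])
// returns ["AaA", "ZZZ", "aAa", "bar", "z"]
```

If the backend expects a case-insensitive or locale-aware order instead, this byte-wise order will not match it.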

> because the struct has to be hashed

While field order is one source of nondeterminism, there are other factors too. Many formats allow different amounts of whitespace or different representations of the same text, such as the Unicode escape \u0066 for the letter f.
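To make this concrete, here is a small sketch with three JSON texts that all describe the same object but differ byte for byte. `DefaultHasher` from the standard library is only a stand-in here, since the hash the backend actually uses is not stated; `digest` and `encodings` are hypothetical helpers.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes raw bytes with the standard library's default hasher
/// (a stand-in for whatever hash the backend really uses).
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Three JSON texts for the same logical object {"f": 1}:
/// compact, with extra whitespace, and with the key written as a Unicode escape.
fn encodings() -> [&'static str; 3] {
    [
        r#"{"f":1}"#,
        r#"{ "f": 1 }"#,
        r#"{"\u0066":1}"#,
    ]
}
```

Because a hash is computed over bytes rather than over the logical value, each of these texts hashes differently. Both sides therefore have to agree on one exact encoding, not only on the field order.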

jonasbb
  • That's magic! I don't see anything related to sorting aside from the self-chosen names... :D Sorry, I left out some information. The first point is that I'm using `rmp_serde`. I attempted to use it in `sort_alphabetically` (with `to_vec_named`, which doesn't really make sense); it compiled, but the output is not sorted. – user Jun 01 '21 at 17:12
  • The second (which probably should get a new post) is that the struct to be serialized is a bit complex, it has enums with nested structs and everything needs to be flattened in the JSON. This manual serialization is working already (`impl Serialize`). But it's very cumbersome to order the fields correctly manually, because of all the nested enums and the flattening. I thought about using a `BTreeMap` as a buffer but wasn't able to set the value type to `Serialize`, because it's not object safe. Not sure if it's a good idea in general. – user Jun 01 '21 at 17:16
  • Concerning the hashing, it's a requirement of a "third party" backend I'm working with. What would you recommend? – user Jun 01 '21 at 17:20
  • `serde_json::Map` (the map type inside `serde_json::Value::Object`) internally uses a `BTreeMap` by default, thus the keys become sorted. The same goes for `toml`. I am unfamiliar with rmp, so I don't know if the mentioned problems occur there too. A vague mention of hashing is not really enough to give tips. – jonasbb Jun 01 '21 at 18:40
  • In situations where the `HashMap` is "deep within your data tree", such that wrapping it in a `SortAlphabetically` struct is inconvenient, there is an alternative solution here that doesn't require that wrapper-struct: https://stackoverflow.com/a/42723390 – Venryx Mar 31 '22 at 15:05