2022-04-27

Additive deserializing with Serde

I'd like to additively deserialize multiple files over the same data structure, where "additively" means that each new file deserializes by overwriting the fields that it effectively contains, leaving unmodified the ones that it does not. The context is config files; deserialize an "app" config provided by the app, then override it with a per-"user" config file.

I use "file" hear for the sake of clarity; this could be any deserializing data source.

Note: After writing the below, I realized maybe the question boils down to: is there a clever use of #[serde(default = ...)] to provide a default from an existing data structure? I'm not sure if that's (currently) possible.

Example

Data structure

struct S {
  x: f32,
  y: String,
}

"App" file (using JSON for example):

{ "x": 5.0, "y": "app" }

"User" file overriding only "y":

{ "y": "user" }

Expected deserializing (app, then user):

assert_eq!(s.x, 5.0);
assert_eq!(s.y, "user");

Expected solution

  • I'm ignoring on purpose any "dynamic" solution storing all config settings into, say, a single HashMap; although this works and is flexible, this is fairly inconvenient to use at runtime, and potentially slower. So I'm calling this approach out of scope for this question.
  • Data structure can contain other structs. Avoid having to write too many per-struct code manually (like implementing Deserialize by hand). A typical config file for a moderate-sized app can contains hundreds of settings, I don't want the burden of having to maintain those.
  • All fields can be expected to implement Default. The idea is that the first deserialized file would fallback on Default::default() for all missing fields, while subsequent ones would fallback on already-existing values if not explicitly overridden in the new file.
  • Avoid having to change every single field of every single struct to Option<T> just for the sake of serializing/deserializing. This would make runtime usage very painful, where due to above property there would anyway be no None value ever once deserialization completed (since, if a field is missing from all files, it defaults to Default::default() anyway).
  • I'm fine with a solution containing only a fixed number (2) of overriding files ("app" and "user" in example above).

Current partial solution

I know how to do the first part of falling back to Default; this is well documented. Simply use #[serde(default)] on all structs.

One approach would be to simply deserialize twice with #[serde(default)] and override any field which is equal to its default in the app config with its value in the user config. But this 1) probably requires all fields to implement Eq or PartialEq, and 2) is potentially expensive and not very elegant (lose the info during deserialization, then try to somehow recreate it).

I have a feeling I possibly need a custom Deserializer to hold a reference/value of the existing data structure, which I would fallback to when a field is not found, since the default one doesn't provide any user context when deserializing. But I'm not sure how to keep track of which field is currently being deserialized.

Any hint or idea much appreciated, thanks!



No comments:

Post a Comment