How to represent different entities that have identical behavior?

Question

I have several different entities in my domain model (animal species, let's say), each with a few properties. The entities are read-only (they do not change state during the application's lifetime) and they have identical behavior (they differ only in the values of their properties).

How to implement such entities in code?

Unsuccessful attempts:

Enums

I tried an enum like this:

enum Animals {
    Frog,
    Duck,
    Otter,
    Fish  
}

Other pieces of code would then switch on the enum. However, this leads to ugly switch statements, scatters the logic around, and causes problems with comboboxes: there's no pretty way to list all possible Animals. Serialization works great, though.
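For reference, here is a minimal sketch of what the enum approach tends to look like once per-animal data is needed. The `AnimalsInfo` helper and its method names are hypothetical, and only the Otter weight comes from the examples in this question. `Enum.GetValues` can enumerate the members, but the result is untyped and every property still ends up in its own switch statement:

```csharp
using System;
using System.Collections.Generic;

public enum Animals { Frog, Duck, Otter, Fish }

// Hypothetical helper: shows where the per-animal data ends up with enums.
public static class AnimalsInfo
{
    // Enum.GetValues returns an untyped Array, so each value must be cast.
    public static IEnumerable<Animals> All()
    {
        foreach (Animals animal in Enum.GetValues(typeof(Animals)))
        {
            yield return animal;
        }
    }

    // Every property needs a switch like this one somewhere in the code base.
    public static double WeightOf(Animals animal)
    {
        switch (animal)
        {
            case Animals.Otter:
                return 10;
            default:
                throw new NotSupportedException("No weight defined for " + animal);
        }
    }
}
```

A combobox could be filled from `AnimalsInfo.All()`, but user-readable names would need yet another switch or lookup.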

Subclasses

I also thought about a design where each animal type is a subclass of a common abstract base class. The implementation of Swim() is the same for all animals, though, so subclassing makes little sense, and serializability becomes a big issue. Since each subclass represents an animal type (a species, if you will), there should be only one instance of it per application, which is hard and weird to maintain when we use serialization.

public abstract class AnimalBase {
    public string Name { get; set; } // user-readable
    public double Weight { get; set; }
    public Habitat Habitat { get; set; }
    public void Swim() { /* swim implementation; the same for all animals, but uses the value of Weight */ }
}

public class Otter : AnimalBase {
    public Otter() {
        Name = "Otter";
        Weight = 10;
        Habitat = new Habitat("North America");
    }
}

// ... and so on

Just plain awful.

Static fields

This blog post gave me an idea for a solution where each option is a statically defined field inside the type, like this:

public class Animal {
   public static readonly Animal Otter = 
       new Animal 
       { Name = "Otter", Weight = 10, Habitat = new Habitat("North America") };
   // the rest of the animals...

   public string Name { get; set; } // user-readable
   public double Weight { get; set; }
   public Habitat Habitat { get; set; }

   public void Swim() { /* swim implementation; the same for all animals */ }

}

That would be great: you can use it like an enum (AnimalType = Animal.Otter), you can easily add a static list of all defined animals, and you have a sensible place to implement Swim(). Immutability can be achieved by making the property setters protected. There is a major problem, though: it breaks serializability. A serialized Animal would have to save all its properties, and deserialization would create a new instance of Animal, which is something I'd like to avoid.
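A minimal sketch of that idea, with the static list included, might look as follows. The `Habitat` class is not shown in the question, so the version here is an assumption, and the Duck values are placeholders. A private constructor and private setters are used instead of protected setters to keep the instances immutable:

```csharp
using System.Collections.Generic;

// Assumed shape of Habitat; the question does not define it.
public sealed class Habitat
{
    public string Region { get; private set; }
    public Habitat(string region) { Region = region; }
}

public sealed class Animal
{
    public static readonly Animal Otter =
        new Animal("Otter", 10, new Habitat("North America"));
    // Placeholder values, for illustration only.
    public static readonly Animal Duck =
        new Animal("Duck", 1, new Habitat("Europe"));

    // Static fields are initialized in textual order, so this list
    // must be declared after the instances it contains.
    public static readonly IReadOnlyList<Animal> All = new[] { Otter, Duck };

    public string Name { get; private set; } // user-readable
    public double Weight { get; private set; }
    public Habitat Habitat { get; private set; }

    // Private constructor: no instances exist besides the static ones.
    private Animal(string name, double weight, Habitat habitat)
    {
        Name = name;
        Weight = weight;
        Habitat = habitat;
    }

    public void Swim() { /* shared implementation; would use Weight */ }
}
```

`Animal.All` can be bound directly to a combobox, and reference equality is enough to compare animals as long as no other code path creates instances.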

Is there an easy way to make the third attempt work? Any more suggestions for implementing such a model?

I'd go with subclasses. I don't get how serialization and single instances are a big issue. You can use Dependency Injection. — Lukasz Madon, Jun 02 '12 at 22:18
If there are multiple instances that represent the same thing, you have to consider equality conditions. When are two Animal types equal? If the name is the same? What if I compare a serialized name from a previous version with a name in the current version? — Dominik, Jun 02 '12 at 22:34
You have the same serialization problems with all solutions. It should be handled by serialization logic. Subclasses are better for testing, encapsulation and future extension of your app. — Lukasz Madon, Jun 02 '12 at 22:47
@Dominik: Are you saying that if you change the name of `Otter` from `Otter` to `Sea Otter`, then you would want the old instances to be deserialized as `Otter` still -- even though they have different names? — Gabe, Jun 02 '12 at 23:06
@Gabe: If I changed the name of `Otter` to `Sea Otter`, I'd like all old instances to be deserialized as `Sea Otter` - the new name. — Dominik, Jun 03 '12 at 14:50

Chris Sinclair · Answer 1 · 2012-06-02T22:57:47.937

If you have issues with serialization, you can always separate the application code from the serialization code. That is, add conversion classes that convert to and from your serialized state. The serializable classes can expose whatever empty constructors and properties the serializer needs, and their only job is to carry serialized state. Meanwhile, your application logic works with the non-serializable, immutable objects. This way you do not mix serialization concerns with logical concerns, which brings with it a host of disadvantages, as you are finding out.

EDIT: Here's some example code:

public class Animal 
{
    public string Name { get; private set; }
    public double Weight { get; private set; }
    public Habitat Habitat { get; private set; }

    internal Animal(string name, double weight, Habitat habitat)
    {
        this.Name = name;
        this.Weight = weight;
        this.Habitat = habitat;
    }

    public void Swim() { /* swim implementation */ }
}

public class SerializableAnimal
{
    public string Name { get; set; }
    public double Weight { get; set; }
    public SerializableHabitat Habitat { get; set; } //assuming the "Habitat" class is also immutable
}

public static class AnimalSerializer
{
    public static SerializableAnimal CreateSerializable(Animal animal)
    {
        return new SerializableAnimal {Name=animal.Name, Weight=animal.Weight, Habitat=HabitatSerializer.CreateSerializable(animal.Habitat)};
    }

    public static Animal CreateFromSerialized(SerializableAnimal serialized)
    {
        return new Animal(serialized.Name, serialized.Weight, HabitatSerializer.CreateFromSerialized(serialized.Habitat));
    }

    // Alternative for the "Static fields" design: use this method instead of the
    // one above (they share a signature and cannot coexist) and switch on the name.
    public static Animal CreateFromSerialized(SerializableAnimal serialized)
    {
        switch (serialized.Name)
        {
            case "Otter":
                return Animal.Otter;
        }

        return null; //or throw exception
    }
}

Then your application logic for serialization might look something like:

Animal myAnimal = new Animal("Otter", 10, new Habitat("North America"));
Animal myOtherAnimal = Animal.Duck; //static fields example

// XmlSerialize and XmlDeserializer stand in for whatever serialization helpers you use
SerializableAnimal serializable = AnimalSerializer.CreateSerializable(myAnimal);
string xml = XmlSerialize(serializable);
SerializableAnimal deserialized = XmlDeserializer<SerializableAnimal>(xml);

Animal restoredAnimal = AnimalSerializer.CreateFromSerialized(deserialized);

Just to reiterate, the SerializableAnimal class is used ONLY in the final layer(s) of your application that need to serialize/deserialize. Everything else works against your immutable Animal classes.

EDITx2: Another major benefit of this managed separation is that you can deal with legacy changes in your code. For example, you have a Fish type, which is pretty broad. Maybe you split it into Shark and Goldfish later and decide that all your old Fish instances should be considered Goldfish. With this separation of serialization, you can now place a check for any old Fish and convert them to Goldfish, whereas direct serialization would result in an exception because Fish no longer exists.
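As a rough sketch of that legacy check, assuming the "Static fields" design and that only the animal's name is stored, the conversion layer could map retired names to current instances. The `LegacyAwareAnimalSerializer` class and its `FromStoredName` method are hypothetical names, and `Animal` is cut down to the minimum needed here:

```csharp
using System;
using System.Collections.Generic;

// Cut-down Animal for illustration; the real class would carry more properties.
public sealed class Animal
{
    public static readonly Animal Shark = new Animal("Shark");
    public static readonly Animal Goldfish = new Animal("Goldfish");

    public string Name { get; private set; }
    private Animal(string name) { Name = name; }
}

// Hypothetical conversion class that tolerates names from older versions.
public static class LegacyAwareAnimalSerializer
{
    // Maps every name that may appear in stored data to a current instance.
    private static readonly Dictionary<string, Animal> ByStoredName =
        new Dictionary<string, Animal>
        {
            { "Shark", Animal.Shark },
            { "Goldfish", Animal.Goldfish },
            { "Fish", Animal.Goldfish } // retired type, now treated as Goldfish
        };

    public static Animal FromStoredName(string storedName)
    {
        Animal animal;
        if (ByStoredName.TryGetValue(storedName, out animal))
        {
            return animal;
        }
        throw new ArgumentException("Unknown animal name: " + storedName, "storedName");
    }
}
```

Old data containing "Fish" then loads as `Animal.Goldfish`, and the application code never needs to know that a Fish type once existed.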

Gabe · Answer 2 · 2012-06-02T23:07:38.267

I would implement it with subclasses, but where the instances of the subclasses don't store any data, like this:

public abstract class AnimalBase {
    public abstract string Name { get; } // user-readable
    public abstract double Weight { get; }
    public abstract Habitat Habitat { get; }
    public void Swim() { /* swim implementation; the same for all animals but uses the value of Weight */ }

    // ensure that two instances of the same type are equal
    public override bool Equals(object o)
    {
        return o != null && o.GetType() == this.GetType();
    }
    public override int GetHashCode()
    {
        return this.GetType().GetHashCode();
    }
}

// subclasses store no data; they differ only in what their properties return
public class Otter : AnimalBase
{
    public override string Name { get { return "Otter"; } }
    public override double Weight { get { return 10; } }
    // here we use a private static member to hold an instance of a class
    // that we only want to create once
    private static readonly Habitat habitat = new Habitat("North America");
    public override Habitat Habitat { get { return habitat; } }
}

Now it shouldn't matter that you have multiple "instances", because each instance only contains its type information (no actual data). Overriding Equals and GetHashCode on the base class means that different instances of the same class will be considered equal.
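To illustrate the effect of the type-based equality, here is a small self-contained sketch. The classes are trimmed to `Name` only, and `Duck` and `EqualityDemo` are hypothetical additions. Two separately created instances of the same subclass collapse into a single entry in a hash-based collection:

```csharp
using System.Collections.Generic;

public abstract class AnimalBase
{
    public abstract string Name { get; }

    // Two instances are equal exactly when they have the same runtime type.
    public override bool Equals(object o)
    {
        return o != null && o.GetType() == GetType();
    }

    public override int GetHashCode()
    {
        return GetType().GetHashCode();
    }
}

public sealed class Otter : AnimalBase
{
    public override string Name { get { return "Otter"; } }
}

public sealed class Duck : AnimalBase
{
    public override string Name { get { return "Duck"; } }
}

public static class EqualityDemo
{
    // Counts distinct species; HashSet relies on Equals and GetHashCode.
    public static int CountDistinct(IEnumerable<AnimalBase> animals)
    {
        return new HashSet<AnimalBase>(animals).Count;
    }
}
```

This is also why a deserialized `Otter` can be compared with, or used as a dictionary key alongside, an `Otter` created elsewhere in the application.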

Hey, that's pretty nice. The objects should serialize just fine as well as all they need to store is their type name and no properties whatsoever. — Chris Sinclair, Jun 02 '12 at 23:18
This approach is interesting - looks easy to implement, easy to understand and serialization works without additional effort. I liked Chris's answer as well, but this one looks cleaner. — Dominik, Jun 03 '12 at 14:56

DreamSonic · Answer 3 · 2012-06-02T23:19:54.080

The way I see it, you are looking for the right creational pattern to suit your needs. Your first option is similar to factory method. The second one looks like a type hierarchy with an optional abstract factory. The third one is a singleton.

It seems like your only problem is serialization. What kind of serialization are we talking about: binary or XML? If it's binary, have you looked at custom serialization? If it's XML, you should either stick with the second option, use custom serialization as well, or delegate the serialization logic outside of your classes.

I personally think the latter is the most architecturally sound solution. Mixing object creation and serialization is a bad idea.

score 0 · Answer 4 · answered Jun 05 '12 at 08:12

I'd go with the third option (objects!), but with a little twist.

The point is: You have a set of objects with some particular schema...

public class Animal {

   public string Name { get; set; } // user-readable
   public double Weight { get; set; }
   public Habitat Habitat { get; set; }

   public void Swim() { /* swim implementation */ }
}

but you want them to be predefined. The catch is: if you serialize such an object, you don't want to have its fields serialized. Initializing the fields is the responsibility of the application, and the only thing you actually want in the serialized version is the "type" of the animal. This will allow you to change "Otter" to "Sea Otter" and keep the data consistent.

Hence, you'd need some representation of the "animal type" - and that's the only thing you want to have serialized. On deserialization, you want to read the type identifier and initialize all the fields based on it.

Oh, and another catch - upon deserialization, you don't want to create a new object! You want to read the ID (and the ID only) and retrieve one of the predefined objects (that corresponds to this ID).

The code could look like:

public class Animal {

   public static Animal Otter;
   public static Animal Narwhal;

   // returns one of the static objects
   public static Animal GetAnimalById(int id) {...}

   // this is here only for serialization,
   // also it's the only thing that needs to be serialized
   public int ID { get; set; } 
   public string Name { get; set; }
   public double Weight { get; set; }
   public Habitat Habitat { get; set; }

   public void Swim() { /* swim implementation */ }
}

So far, so good. If there are dependencies that prohibit you from making instances static, you could throw in some lazy initialization for all the Animal objects.
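One possible way to fill in GetAnimalById together with the lazy initialization is sketched below. The ID values, the dictionary and the private constructor are assumptions made for illustration, and the class is trimmed to ID and Name:

```csharp
using System;
using System.Collections.Generic;

public sealed class Animal
{
    // Lazy<T> defers construction until first use and is thread-safe by default.
    private static readonly Lazy<Animal> otter =
        new Lazy<Animal>(() => new Animal(1, "Otter"));
    private static readonly Lazy<Animal> narwhal =
        new Lazy<Animal>(() => new Animal(2, "Narwhal"));

    public static Animal Otter { get { return otter.Value; } }
    public static Animal Narwhal { get { return narwhal.Value; } }

    // Declared after the Lazy fields because static fields initialize in textual order.
    private static readonly Dictionary<int, Lazy<Animal>> byId =
        new Dictionary<int, Lazy<Animal>>
        {
            { 1, otter },
            { 2, narwhal }
        };

    // Returns one of the predefined objects; never creates a new one per call.
    public static Animal GetAnimalById(int id)
    {
        Lazy<Animal> entry;
        if (!byId.TryGetValue(id, out entry))
        {
            throw new KeyNotFoundException("No animal with ID " + id);
        }
        return entry.Value;
    }

    public int ID { get; private set; }
    public string Name { get; private set; }

    private Animal(int id, string name)
    {
        ID = id;
        Name = name;
    }
}
```

Because the IDs are what ends up in stored data, they should stay stable across versions even when names or other properties change.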

The Animal class starts to kind of look like "a couple singletons in one place".

Now, how do we hook this into .NET's serialization mechanism (BinaryFormatter or DataContractSerializer)? We want the serializer to use GetAnimalById instead of the constructor when deserializing, and to store only the ID when serializing.

Depending on your serialization API, you can do this with ISerializationSurrogate or IDataContractSurrogate. This is an example:

class Surrogate : IDataContractSurrogate {

    public Type GetDataContractType(Type type) {
        if (typeof(Animal).IsAssignableFrom(type)) return typeof(int);
        return type;
    }

    public object GetObjectToSerialize(object obj, Type targetType) {
        // map any animal to its ID
        if (obj is Animal) return ((Animal)obj).ID;
        return obj;
    }

    public object GetDeserializedObject(object obj, Type targetType) {
        // use the static accessor instead of a constructor!
        if (targetType == typeof(Animal)) return Animal.GetAnimalById((int)obj);
        return obj;
    }

    // the remaining IDataContractSurrogate members are omitted for brevity
}

BTW: DataContracts seem to have a bug (or is it a feature?) which causes them to act weirdly when the substitute type is a basic type. I ran into this problem when serializing objects as strings: the GetDeserializedObject method was never called when deserializing them. If you run into this behaviour, use a wrapper class or struct around that single int field in the surrogate.
