1

I am having trouble identifying the pros and cons of the two examples below. Ideally, I would like to know which one is the better design:

Example 1 (Type Property):

public abstract class Animal : /* all relevant interfaces*/
{
    /*
    All necessary implementations
    */
}

public class Dog : Animal
{
    public string Name { get; set; }
    public Breed Type { get; set; }

    /* 
    All necessary implementations
    */
}

public enum Breed
{
    Bulldog,
    Chihuahua,
    Labrador,
    /*
    And so on...
    This can grow without limit;
    you don't know how large it will become, or
    what breed you will have tomorrow, e.g. BulldogChihuahuaCross
    or BulldogLabradorCross.
    */
}

Example 2 (Empty Derived Classes):

public abstract class Animal : /* all relevant interfaces*/
{
    /*
    All necessary implementations
    */
}

public class Dog : Animal
{
    public string Name { get; set; }

    /* 
    All necessary implementations
    */
}

public class Bulldog : Dog
{
   /* Empty Class */
}

public class Chihuahua : Dog
{
   /* Empty Class */
}

public class Labrador : Dog
{
   /* Empty Class */
}

Edit:

In example 1, an instance of Dog is created and its breed is assigned as a property; in example 2, an instance of that specific breed's class is created.

I am looking for an in-depth argument: scalability, maintainability, CPU cost if there are 1000 breeds, cost of running queries, etc.

Jegan
  • This is actually a very good question that most people don't bother to ask. I have what I think is a good answer for you, but don't have time to write it up properly until later today. Stay tuned. – Jim L. Sep 20 '16 at 19:47
  • @JimL. I am looking forward to your answer. – Jegan Sep 20 '16 at 20:16
  • @JimL. Or like Fermat said: I have an excellent proof but the margin here is too narrow to write it down ;-) – qwerty_so Sep 20 '16 at 21:45
  • @ThomasKilian I wouldn't say no to hear your view :). – Jegan Sep 20 '16 at 22:09

3 Answers

2

There are two extremes on a spectrum. On one end are sets of things that are merely informational and manipulated as text, numbers, or references to enumerated instances. For example, a dog salon application might record your pet’s name, color, and breed, merely as text for identification purposes. On the other end of the spectrum are sets of things that need to have different properties, operations, or methods. For example, a video game might want to emit a different sound for each breed of dog when it barks.

The decision one has to make as a designer is how accurately to reflect the “domain of discourse” vs. when to take shortcuts for a particular application. In the domain of discourse, each breed has characteristics that make it unique, but for the purposes of a dog salon application, you just don’t care. In that case, you could design away the detail and hope you never need it in the future.

The problem lies in the middle of that spectrum, where someone makes the wrong decision. I have worked on large enterprise systems, where someone squashed an inheritance hierarchy into types explicitly encoded as values in one or more database columns. This was horrific because it became the programmers’ problem to know which other columns were valid for every encoded type. The system had switch statements all over the place to implement business rules that evolved over time into something very complex. The system had lots of bugs.

For that reason, in the middle of the spectrum, where things are not so clear cut, I would err on the side of many classes. The reasoning is that accurately reflecting the domain makes it easier to tell whether or not requirements are met, and can make the code more intuitive to maintain (assuming a domain-driven design). The key to reflecting the domain is accurately representing sets of things. For example, “Fido” is a member of the Dog set and of the Animal superset; “Sylvester” is a member of the Cat set and of the Animal superset. This is easy for people to understand, and each class can have different implementations hidden from the outside.

Once you start interpreting types explicitly encoded as strings, integers, or enumeration-literal references, you start needing switch statements all over the place, and you end up with an unmaintainable, hard-to-understand mess. Many OO languages can save you from all that, but you have to create a factory with a complete list of types. The runtime overhead is negligible (especially if you implement the factory to run in constant time), and the types can be mapped to values in a relational database so you can query and report on them. Besides, as you point out, you may be able to take advantage of generics.
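As a rough illustration of the factory idea, here is a minimal sketch. The breed codes, the `Bark` method, and the `DogFactory` name are all assumptions made up for this example; they are not taken from the question's code. The dictionary lookup is what keeps creation at constant time on average, and the string keys stand in for whatever value a database column might hold.

```csharp
using System;
using System.Collections.Generic;

public abstract class Dog
{
    public string Name { get; set; } = "";

    // Each breed supplies its own behavior, so callers never switch on a type code.
    public abstract string Bark();
}

public sealed class Bulldog : Dog
{
    public override string Bark() => "Woof";
}

public sealed class Chihuahua : Dog
{
    public override string Bark() => "Yip";
}

public static class DogFactory
{
    // Hypothetical breed codes, as they might be stored in a database column.
    // This is the one place that must list every breed.
    private static readonly Dictionary<string, Func<Dog>> Creators =
        new Dictionary<string, Func<Dog>>(StringComparer.OrdinalIgnoreCase)
        {
            ["bulldog"] = () => new Bulldog(),
            ["chihuahua"] = () => new Chihuahua(),
        };

    public static Dog Create(string breedCode, string name)
    {
        // Hash lookup: constant time on average, however many breeds exist.
        if (!Creators.TryGetValue(breedCode, out var create))
        {
            throw new ArgumentException($"Unknown breed code: {breedCode}", nameof(breedCode));
        }

        Dog dog = create();
        dog.Name = name;
        return dog;
    }
}
```

With this in place, `DogFactory.Create("bulldog", "Fido").Bark()` goes through the `Bulldog` override without any switch statement at the call site. The price is that the dictionary has to be kept in sync with the set of breed classes.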

If you know your classes will always be empty, as is the case in a dog salon application, don’t bother with classes. If you know they will have different properties, operations, and methods, or if you’re not sure, use classes.

Jim L.
  • Thanks Jim. This is exactly my issue; the example I gave is much simpler than what I have in reality. I am working on an enterprise system that is much more complex. It uses "Event sourcing" and a DAG (directed acyclic graph), so there are two parts to the system: part one creates the instances of the objects, and in part two those objects are registered with event sourcing, which relies heavily on generics. To accommodate the use of generics, over time the empty classes have taken over. – Jegan Sep 21 '16 at 11:56
  • When the project started there were lots of unknowns, but now it is somewhat settled. My question is whether it is worth moving away from the domain of discourse; at the same time, I want to be careful that it does not end in a big mess of switch statements or ifs everywhere. – Jegan Sep 21 '16 at 11:56
  • Why take the chance of changing to an enumeration at this point? I pointed out the potential for nastiness down the road. I don't think classes have a real downside, except that they are annoying. – Jim L. Sep 21 '16 at 12:19
1

The first way would let you determine which breed the dogs are, but the second way would let you make specific implementations for the subclass.

Go with derived classes if you are going to need the objects to have different methods or properties for each dog breed; otherwise, it's simpler to just have the type property and a single class.
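To make the difference concrete, here is a small sketch of how "find all dogs of one breed" might look in each design. The names `TaggedDog`, `DogQueries`, `ByBreedProperty`, and `ByBreedType` are hypothetical and only serve this illustration. Over a plain in-memory collection, both versions scan every element, so neither design is inherently cheaper for this kind of query.

```csharp
using System.Collections.Generic;
using System.Linq;

public enum Breed { Bulldog, Chihuahua, Labrador }

// Example 1 style: the breed is a property on a single class.
public class TaggedDog
{
    public string Name { get; set; } = "";
    public Breed Type { get; set; }
}

// Example 2 style: the breed is the runtime type.
public class Dog
{
    public string Name { get; set; } = "";
}

public class Bulldog : Dog { }

public class Chihuahua : Dog { }

public static class DogQueries
{
    // Filters on the property value; a linear scan.
    public static List<TaggedDog> ByBreedProperty(IEnumerable<TaggedDog> dogs, Breed breed) =>
        dogs.Where(d => d.Type == breed).ToList();

    // Filters on the runtime type; also a linear scan.
    public static List<TBreed> ByBreedType<TBreed>(IEnumerable<Dog> dogs) where TBreed : Dog =>
        dogs.OfType<TBreed>().ToList();
}
```

The generic version reads nicely at the call site (`DogQueries.ByBreedType<Bulldog>(dogs)`), but the breed has to be known at compile time. The property version can take the breed from user input or a database row at runtime.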

SilentLupin
  • I am looking for an in-depth argument on the pros and cons of both designs; for example, say this is an enterprise system with 1 million dogs and 1000 breeds. – Jegan Sep 20 '16 at 19:44
  • There's already a rather lengthy discussion on it by user Kasper Holdum: [http://stackoverflow.com/questions/1338391/when-to-subclass-instead-of-differentiating-the-behaviour](http://stackoverflow.com/questions/1338391/when-to-subclass-instead-of-differentiating-the-behaviour) – SilentLupin Sep 20 '16 at 19:51
  • Actually, this is almost identical to your question and is answered pretty well: [http://stackoverflow.com/questions/4254182/inheritance-vs-enum-properties-in-the-domain-model](http://stackoverflow.com/questions/4254182/inheritance-vs-enum-properties-in-the-domain-model) – SilentLupin Sep 20 '16 at 19:57
  • Neither of the links answers my question; I am not looking for the meaning of inheritance. I am looking for which design is more scalable and maintainable at enterprise level. If you look at the derived classes, they are empty because they only represent uniqueness; they do not add any functionality. – Jegan Sep 20 '16 at 20:20
  • @jagan then why do you add sub-classes if you don't extend them with functionality? That does not make sense. – qwerty_so Sep 20 '16 at 21:49
  • @ThomasKilian The subclass can be used to create a unique instance, so I can have a generic query extension such as Retrieve(string name), which will always find that unique type. With example 1, on the other hand, I have to write a query such as Retrieve(Type breed, string name). This will have to search through every instance of Dog and filter by type and name, which sounds costly on the CPU, but I don't know if that is true. – Jegan Sep 20 '16 at 21:56
  • @ThomasKilian This is where I am confused. When I draw the UML for example 2, it looks pointless, but when I think about the generic extension, it makes sense. So which one has the most benefit? – Jegan Sep 20 '16 at 22:04
  • I think implementation-wise you'd always put things you need fast access to in a hash, and what it holds is type-irrelevant. Neither design will give you faster access than a hash. – qwerty_so Sep 20 '16 at 23:40
1

OK, so here are my 2 cents: if you are looking for fast access to anything, you'd put it in an associative array. Both of your implementations deliver a hash code to be used (by default based on object identity), which makes them accessible at the same speed. Subclassing makes sense (basically) only if you intend to add functionality. Getting the type from a classifier is no different from getting it from the enumeration. If you need clever queries, create a clever associative array.
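A minimal sketch of such an associative array for the enum-based design, assuming that breed plus name identifies a dog uniquely (that uniqueness is my assumption, not something stated in the question). `DogIndex` and `TryRetrieve` are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;

public enum Breed { Bulldog, Chihuahua, Labrador }

public class Dog
{
    public string Name { get; set; } = "";
    public Breed Type { get; set; }
}

public class DogIndex
{
    // Composite key: value tuples compare and hash by their elements,
    // so (breed, name) works directly as a dictionary key.
    private readonly Dictionary<(Breed, string), Dog> _byBreedAndName =
        new Dictionary<(Breed, string), Dog>();

    // Adds the dog, replacing any earlier entry with the same breed and name.
    public void Add(Dog dog) => _byBreedAndName[(dog.Type, dog.Name)] = dog;

    // Average constant-time lookup; no scan over all dogs.
    public bool TryRetrieve(Breed breed, string name, out Dog dog) =>
        _byBreedAndName.TryGetValue((breed, name), out dog);
}
```

With an index like this, a `Retrieve(breed, name)` style query does not have to filter through every instance, whether there are 3 breeds or 1000. The same idea works for the subclass design by keying on `(Type, string)` instead.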

qwerty_so
  • That is a very good point. I am trying to move away from empty subclasses to a list of types, but I want to make sure that I don't create a mess in the design. – Jegan Sep 21 '16 at 12:01
  • Design-wise (as said) the enumeration is the better choice since you only create sub-classes for typing and not to add functionality which is their real purpose. – qwerty_so Sep 21 '16 at 12:41