Associate data with concrete subclasses from a base abstract type

Question

I've stumbled upon the need to do this a few times recently with some lower level framework type stuff and I'd like to see if there is a better/cleaner way to accomplish this, i.e. if I'm missing something obvious or clever, like the time I discovered [ThreadStatic] to replace dictionary lookups against thread IDs for associating data with Threads.

I have a base abstract class, let's call it Entity. Every Entity needs to perform a set of initialization actions in the constructor that depends on the actual concrete class being instantiated. Is there a way I can accomplish this without doing a dictionary lookup and calling this.GetType()?

Here is some code similar to what I have now:

public abstract class Entity
{
    private static Dictionary<Type, Action<EntityData>> _initActions = new Dictionary<Type, Action<EntityData>>();

    private EntityData _data = new EntityData();

    protected Entity()
    {
        _initActions[this.GetType()].Invoke(_data);
    }
}

public class Employee : Entity
{
    public string Name { get; set; }
}

public class Manager : Employee
{
    public List<Employee> Subordinates { get; set; }
}

The Employee constructor and Manager constructor need to initialize their _data fields differently as they are different types. The _initActions collection gets initialized in another method prior to any instances being new'd up, which I don't think has any significance on this discussion.

I want usage of the class to remain as simple as possible for the user of the framework, so I can't use strange hacks like requiring users to override an Init method in each concrete type in some peculiar or unintuitive way.

Generics almost work, in the sense that I could do something like Entity<TEntity> to get a TEntity specific static field to store the init method if I didn't have any inheritance, but inheritance needs to be supported so I would need a dictionary of all init methods for the subclasses of TEntity anyway.
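To illustrate the generics point, here is a minimal sketch. `TypeData<T>` and `Describe` are hypothetical names, not part of the original code. Each closed generic type gets its own static field, so reading it needs no dictionary, but the type argument is fixed at compile time from the static type, which is why it does not follow the runtime type through an inheritance chain.

```csharp
using System;

// Hypothetical holder: every closed generic type
// (TypeData<Employee>, TypeData<Manager>, ...) has its own static field.
public static class TypeData<T>
{
    public static string Value = "";
}

public class Employee { }
public class Manager : Employee { }

public static class Program
{
    // T is resolved at compile time from the static type of the argument,
    // not from the runtime type of the object.
    public static string Describe<T>(T instance)
    {
        return TypeData<T>.Value;
    }

    public static void Main()
    {
        TypeData<Employee>.Value = "employee data";
        TypeData<Manager>.Value = "manager data";

        Employee e = new Manager();
        Console.WriteLine(Describe(e));             // static type is Employee
        Console.WriteLine(Describe(new Manager())); // static type is Manager
    }
}
```

The first call returns the Employee entry even though the object is a Manager, which is the gap a base constructor would hit if it relied on a generic static field alone.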

This code runs in some pretty low level database engine type scenarios in tight loops with 1m iterations, so getting rid of the dictionary lookup does provide some significant speedups in certain situations (tested by replacing with a hacky Init override implementation).

Any ideas?

EDIT:

I want to make a few things clear. The entity engine automatically sets up _initAction to do what it needs to to initialize its _data container. The "user" of the library knows nothing about this process and doesn't need to. All I was inquiring about is a way to avoid a dictionary lookup to get type-specific runtime information from a base class, but that may not be possible.

Yes, this is micro-optimization, but we have tested this with real queries and gotten 15-20% query time reductions on some queries that need to instantiate large datasets.

The faster code looked like this:

public class Employee : Entity
{
    private static EntityInitializer _initMethod = Entity.GetInitMethod(typeof(Employee));

    public string Name { get; set; }

    public Employee()
    {
        _initMethod.Invoke(this);
    }
}

This way, the dictionary lookup is done once for the Employee type. It's not horrible, but it a) requires boilerplate in every single class, which I don't like, and b) is slightly error prone, as you have to match up the type parameter with the current class, otherwise funky things happen, kind of like when you type in the wrong owner class name for a dependency property in WPF. It kinda sometimes works, but then weird bugs pop up and they're hard to trace back.

What it comes down to is this: is there a better way to attach arbitrary runtime data to a Type besides using a Dictionary, considering that all the types that will have this data attached to them derive from a common base class?

Can you explain why you're using an abstract base class instead of an interface? Why not just define an interface with an init method and make all the sub classes implement that interface? — evanmcdonnal, May 01 '13 at 15:52
Why not just have another ctor that receives the type and pass the type? — paparazzo, May 01 '13 at 15:55
@evanmcdonnal The initAction method isn't user generated, it's automatically generated based on the class definition. Having the user implement an interface method would be the same as having them override an abstract Init method in some weird hacky way, which I covered above. It's an abstract base class and not an interface because it implements a ton more functionality than I am showing above. — Mike Marynowski, May 01 '13 at 15:59
@Blam That could possibly replace the GetType() call by making the user call a base constructor with typeof(Employee) or whatever, but that would still require the dictionary lookup to get the actual init method for that type, which is the slow part. I'll take the GetType() call over making the user call a base constructor a certain way and having checks to verify that it is done right (i.e. what if they pass the wrong type?). — Mike Marynowski, May 01 '13 at 16:02
You're creating your own virtual table here to support inheritance (as I see it), and you seem to be automating that somehow (based on your last comment). I'm not sure I get it: you're already working around the language here, and this feels more hacky to me (though I sort of understand your point about users). It's unclear how you want this `marriage` to work; you don't want inheritance but you'd still like something better. Is the main request here for user classes to not have to implement anything? — NSGaga-mostly-inactive, May 01 '13 at 16:03
Most low-level stuff gets hacky when you are squeezing out performance, I don't really mind that. I don't mind hacky stuff in MY code, I just don't want the users to feel like using the class is hacky, if that makes sense. The users can implement stuff, I just don't like repetitive boilerplate to have to be implemented in a particular way in every single class. The Entity class takes care of determining how a particular type needs the _data field to be initialized automatically based on its definition, which I'm storing in _initActions. — Mike Marynowski, May 01 '13 at 16:08
@MikeMarynowski alright, I was just checking because the functionality you showed is minimal and doesn't merit any complicated solution. — evanmcdonnal, May 01 '13 at 16:10
Essentially if I could have my cake and eat it too, I would love something like [ThreadStatic] called [TypeStatic] that would be available for each concrete type. Then I could just do _initAction.Invoke(_data), where _initAction is [TypeStatic] private static Action so that I can avoid the dictionary lookup. — Mike Marynowski, May 01 '13 at 16:14
For the sake of this question, we can say _initActions gets populated by a static method called Entity.Initialize(), which finds all classes that implement Entity, reflects them to determine what fields they have and how much data storage each one will need based on that, and then populates _initActions with an entry that initializes the passed in EntityData object to the right size. It can just as easily be a Dictionary or something of the sort, and I could fill the _data field right inside the Entity constructor based on the information in EntityInfo. — Mike Marynowski, May 01 '13 at 16:18
If you wouldn't mind giving me a taste of what you are getting at that would be wonderful :) — Mike Marynowski, May 01 '13 at 16:20
I wouldn't really call it IoC. What it comes down to is that I just want a set of type-specific data, set during initialization at runtime, to be accessible to the constructor of all types that implement my base class. — Mike Marynowski, May 01 '13 at 16:25
I might be misunderstanding the question completely here, but why don't you just define a `protected abstract` method that the `Entity` base constructor calls? That way, subclasses will be forced to implement it and it'll still be executed as early as you want it and you won't be doing any reflection. Sure, you might get some warnings from VS, but if this is what you want to do, you can ignore them. — Theodoros Chatzigiannakis, May 01 '13 at 17:00
Reflection isn't a problem and is necessary - it is only done once at application initialization to determine the structure of the objects so that the entity engine can cache the information about each type, and initialize the _data field of each object accordingly when they are constructed. The "user", i.e. the person implementing Employee and Manager, has nothing to do with this process. They new up an object and just use it. They have no knowledge of the private EntityData _data field. — Mike Marynowski, May 01 '13 at 17:39

paparazzo · Accepted Answer · 2013-05-01T17:06:03.183

2

Could you not just create a ctor that you pass the type to?

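public abstract class Entity
{
    private static Dictionary<Type, Action<EntityData>> _initActions = new Dictionary<Type, Action<EntityData>>();

    private EntityData _data = new EntityData();

    protected Entity(Type type)
    {
        _initActions[type].Invoke(_data);
    }
}

public class Employee : Entity
{
    // typeof(Employee) is evaluated once and cached in a static field
    private static Type mytype = typeof(Employee);
    public string Name { get; set; }
    public Employee(): base(mytype)
    { }
}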

The lookup is causing performance issues?
Dictionary lookup is O(1) and takes a tiny fraction of a millisecond.
A program can only have so many classes.
Entity still needs to create the object, create a new EntityData, and run Invoke.
In addition to initialization of the classes that implement Entity.
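One way to check the cost before optimizing is to time the lookup in isolation. The following is a rough sketch only: `TimeLookups` and the sample dictionary contents are made up for illustration, a `Stopwatch` loop is not a substitute for a profiler run on the real engine, and the numbers will vary by machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class Program
{
    // Runs `iterations` Type-keyed lookups and returns the total elapsed milliseconds.
    public static long TimeLookups(int iterations, out int hits)
    {
        // Placeholder entries standing in for the per-type init actions.
        var actions = new Dictionary<Type, Action>
        {
            { typeof(string), () => { } },
            { typeof(int), () => { } },
            { typeof(List<int>), () => { } },
        };
        object sample = "some instance";
        hits = 0;

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            // Same shape as the question's constructor: GetType() plus a lookup and an invoke.
            Action action;
            if (actions.TryGetValue(sample.GetType(), out action))
            {
                action();
                hits++;
            }
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        int hits;
        long totalMs = TimeLookups(1000000, out hits);
        Console.WriteLine(hits + " lookups took " + totalMs + " ms in total");
    }
}
```

Comparing that total against the time of a full query gives a first indication of whether the lookup can plausibly account for the difference being chased.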

edited May 01 '13 at 17:06

answered May 01 '13 at 16:08

paparazzo


As commented above, it's more the dictionary lookup I'm trying to avoid as opposed to the GetType() call. – Mike Marynowski May 01 '13 at 16:11
@MikeMarynowski Point taken. See my update. Static should just be called once. – paparazzo May 01 '13 at 16:13
OK I get what you are saying about the Dictionary lookup. Dictionary lookup is fast. That lookup is causing performance issues? And the stated question is "without doing a dictionary lookup and calling this.GetType()?" – paparazzo May 01 '13 at 16:25
Removing the type dictionary lookups on certain queries that have to instantiate large datasets reduced the query time by 15 - 20%, which can be a big deal on heavily loaded DB servers. Yes, it's micro-optimizations at this point, but that's the point we are at with our product...10% here, 10% there and you have a query engine that returns results in half the time. – Mike Marynowski May 01 '13 at 17:35
Sorry, still not buying that Dictionary lookup (one of the fastest operations in .NET) is your bottleneck. In a comment you stated you concluded the lookup was the problem by hacking in an init. Put this in a profiler and measure performance. If you are optimizing, there have to be bigger fish to fry than a dictionary lookup in a ctor. Why are you using a class and not a struct? Why are you not overriding GetHashCode? Really, there have to be bigger fish to fry. Could you make it an array and pass an ordinal index? – paparazzo May 01 '13 at 18:03
These are low level classes that almost every application object implements and here every little bit counts. We've fried the bigger fish and during optimization testing we noticed we could get a decent boost by avoiding this. When you query over a dataset with millions of records, these dictionary lookups add up. I have posted a simplified example of what I'm trying to avoid, I'm not going to paste in a 20k LOC DB engine in here. My question is relatively simple - can I attach data to all the types in an object hierarchy that is accessible from the base type without a dictionary lookup? – Mike Marynowski May 01 '13 at 18:11
Sorry then I don't know how to do what you want to do. – paparazzo May 01 '13 at 18:37
That's what we've done with the individual field data but that's after the constructor runs and sets up the _data container with all the ordinal information - I think I'm stuck with either a dictionary lookup or forcing the users of the library to write a specific line of code. It just nags at my OCD because it's so close to perfect lol. I now have this in all the class constructors: `Entity.Init(this)`. No more static field to hold the initializer, the generic class does it. I think that's about as good as it will get. – Mike Marynowski May 01 '13 at 18:38
Just looking through old SO posts and stumbled upon this again, and I wish I could just delete it because it's all a bit convoluted as it was somewhat difficult to explain, resulting in this mess of a thread lol. That said, I'll mark this as the answer because you were definitely right about the dictionary lookup, it was not what was causing the speed issue at all. I think I missed the O(1) part of your answer - I didn't realize at the time dictionaries were so damn fast and I made a silly mistake interpreting the profiler results. – Mike Marynowski Oct 28 '17 at 07:23
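The `Entity.Init(this)` approach mentioned in the comments above is not shown in the thread. The following is one possible sketch of it, not the asker's actual code: `Register`, `Initializer<T>`, `StorageSize`, and the runtime-type check are all assumptions made for illustration. The idea is that a nested generic class caches the dictionary result in a static field, so the dictionary is read once per type.

```csharp
using System;
using System.Collections.Generic;

public class EntityData
{
    public int StorageSize;
}

public abstract class Entity
{
    private static readonly Dictionary<Type, Action<EntityData>> _initActions =
        new Dictionary<Type, Action<EntityData>>();

    private readonly EntityData _data = new EntityData();

    public int StorageSize { get { return _data.StorageSize; } }

    // Hypothetical stand-in for the engine's reflection-based setup step.
    public static void Register(Type type, Action<EntityData> action)
    {
        _initActions[type] = action;
    }

    // One static field per closed generic type; the dictionary is read once per T.
    private static class Initializer<T> where T : Entity
    {
        public static readonly Action<EntityData> Run = _initActions[typeof(T)];

        // Explicit static constructor: initialize exactly on first use, after Register.
        static Initializer() { }
    }

    // T is inferred from the static type of `this` at the call site.
    protected static void Init<T>(T entity) where T : Entity
    {
        // Assumption: only the most derived type initializes the data, so the
        // Employee constructor does nothing while a Manager is being built.
        if (entity.GetType() == typeof(T))
        {
            Entity target = entity;
            Initializer<T>.Run(target._data);
        }
    }
}

public class Employee : Entity
{
    public Employee() { Entity.Init(this); } // T inferred as Employee
}

public class Manager : Employee
{
    public Manager() { Entity.Init(this); }  // T inferred as Manager
}

public static class Program
{
    public static void Main()
    {
        Entity.Register(typeof(Employee), data => data.StorageSize = 1);
        Entity.Register(typeof(Manager), data => data.StorageSize = 2);

        Console.WriteLine(new Employee().StorageSize); // 1
        Console.WriteLine(new Manager().StorageSize);  // 2
    }
}
```

This still costs every class one line of boilerplate, but there is no type argument to mistype, since `T` comes from `this`.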

Jodrell · Answer 2 · 2013-05-01T17:04:24.287

2

Why does the type of the sub class affect the way an encapsulated class should be populated? This seems like a violation of some OO principles to me.

If there is some specialized behaviour for a subclass, then

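public abstract class Entity
{
    private readonly EntityData data = new EntityData();

    protected Entity()
    {
        // A field initializer cannot call an instance method, so call it here.
        InitializeData(data);
    }

    protected abstract void InitializeData(EntityData data);
}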

seems like a better definition for the base class. The specialized action can be defined in the sub class:

public class Employee : Entity
{
     protected override void InitializeData(EntityData data)
     {
        // Employee specific implementation here ...
     }
}

This requires no Dictionary, lookup or even a switch statement. No static state is required. It means the sub class related code has to be in the sub class but, that is a good thing, that is OO.
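One caveat with calling an abstract method from a base constructor: the override runs before the subclass constructor body does. A minimal sketch of the ordering follows; the `Log` list and the two string fields are hypothetical and exist only to make the call order visible.

```csharp
using System;
using System.Collections.Generic;

public abstract class Entity
{
    // Log is only here so the demo can show the call order.
    public readonly List<string> Log = new List<string>();

    protected Entity()
    {
        Log.Add("Entity constructor");
        InitializeData(); // virtual dispatch reaches the most derived override
    }

    protected abstract void InitializeData();
}

public class Employee : Entity
{
    private readonly string _fromInitializer = "set";
    private readonly string _fromConstructor;

    public Employee()
    {
        _fromConstructor = "set";
        Log.Add("Employee constructor");
    }

    protected override void InitializeData()
    {
        // Field initializers have already run; the constructor body has not.
        Log.Add("InitializeData: initializer field = " + (_fromInitializer ?? "null")
                + ", constructor field = " + (_fromConstructor ?? "null"));
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (string line in new Employee().Log)
        {
            Console.WriteLine(line);
        }
        // Entity constructor
        // InitializeData: initializer field = set, constructor field = null
        // Employee constructor
    }
}
```

So an override used this way should not depend on anything the subclass constructor body sets up.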

If it's necessary to preserve more of what you have, you could do something like:

public abstract class Entity
{
    private readonly EntityData data;

    protected Entity(Action<EntityData> initializeData)
    {
        // Action<EntityData> returns nothing, so create the data first, then initialize it.
        this.data = new EntityData();
        initializeData(this.data);
    }
}

public class Employee : Entity
{
    public Employee() : base(SomeStaticAction)
    {
    }
}

edited May 01 '13 at 17:04

answered May 01 '13 at 16:28

Jodrell


@Bobson I feel like your answer is heading that way. – Jodrell May 01 '13 at 16:57
Yeah, the concepts are there, but I didn't specifically call it out. – Bobson May 01 '13 at 17:00
The issue is that the user (i.e. the person implementing Employee) doesn't determine what the init actions are. The init actions are determined beforehand by the Entity engine - it figures out how to initialize the data container based on what fields the entity has. – Mike Marynowski May 01 '13 at 17:05
@MikeMarynowski The actions of the engine must be determined after the subclass is programmed. One is the logical inception of the other. Otherwise, there is no extensibility. How can the engine know the correct action for some as yet undefined sub class? Reflection, Attributes? If you use the dictionary, how is that maintained? – Jodrell May 01 '13 at 17:13
What? Of course they don't. The subclass inherits behaviour from Entity, part of which is change tracking and notification, property dependency tracking, etc...which requires the _data field to be initialized based on what fields you have defined on your class. – Mike Marynowski May 01 '13 at 17:16
@MikeMarynowski the common bits go in the base class, the specialized bits go in the sub class. – Jodrell May 01 '13 at 17:18
Yes, and the common bits are change tracking and notification, property dependency tracking, etc....which in order to work, need to initialize _data depending on the field structure of the current class. – Mike Marynowski May 01 '13 at 17:40

Bobson · Answer 3 · 2013-05-01T16:44:05.553

0

I really feel like you're overthinking this. Why not just have Entity have an abstract get-only property that needs to be overridden?

public abstract class Entity
{
    private static Dictionary<Type, Action<EntityData>> _initActions = 
                new Dictionary<Type, Action<EntityData>>();

    protected abstract EntityData _data { get; }

    protected Entity()
    {
        _initActions[this.GetType()].Invoke(_data);
    }
}

public class Employee : Entity
{
    public string Name { get; set; }
    protected override EntityData _data { 
         get { return new EntityData("Employee Stuff"); } 
    }
}

public class Manager : Employee
{
    public List<Employee> Subordinates { get; set; }
    protected override EntityData _data { 
         get { return new EntityData("Manager Stuff"); } 
    }
}

Alternatively, just have two Init methods.

public abstract class Entity
{
    private static Dictionary<Type, Action<EntityData>> _initActions = 
                new Dictionary<Type, Action<EntityData>>();

    private void InitalizeBase() { /* do shared construction */ }
    protected abstract void Initalize();

    protected Entity()
    {
        InitalizeBase();
        Initalize();
    }
}

public class Employee : Entity
{
    public string Name { get; set; }
    protected override void Initalize()
    {
        // Do child stuff
    }
}

edited May 01 '13 at 16:44

answered May 01 '13 at 16:38

Bobson


Because the user of the library doesn't determine what the initialize actions are, the entity itself does. The way a particular type initializes its data container depends on how many and what type of fields it has, which is determined by the library automatically. – Mike Marynowski May 01 '13 at 17:02
@MikeMarynowski - So? Write that into an automatically generated `Initalize()` method. If the fields are being generated automatically, they can be initialized automatically. – Bobson May 01 '13 at 17:10
The fields aren't being generated automatically. The user of the class defines the fields, and at runtime the structure of the object is reflected by the entity engine and when the constructor of each entity runs the _data field is initialized accordingly. – Mike Marynowski May 01 '13 at 17:17
In my example above, Employee and Manager are classes written by the "user" of the library. The library itself has the "Entity" class in it. – Mike Marynowski May 01 '13 at 17:18
@MikeMarynowski - Then how is it going to know what to do with an `EntityData`, even if it could automatically generate it? – Bobson May 01 '13 at 17:20
I don't follow what you are confused about. An initialization method reflects over all the classes that implement Entity and sets up the _initActions entries accordingly...so, for example, after tallying the storage size, it does: _initActions.Add(entityType, data => data.InitStorage(entityStorageSize)); – Mike Marynowski May 01 '13 at 17:26
@MikeMarynowski - At this point, I'm going to suggest you go over to [codereview.stackexchange.com] and post a *working* sample of your code for optimization, because I still have no idea why you're trying to do this. Are you saying that you dynamically generate the initialization routines at runtime and inject them into the class? – Bobson May 01 '13 at 17:43
Okay, what it comes down to is this: is there a better/faster way to attach arbitrary data to a particular Type at runtime besides using a dictionary? – Mike Marynowski May 01 '13 at 17:54
It doesn't inject anything - it invokes it in the Entity constructor, as shown. So if by injecting you mean invoking it for every Entity as part of the base class behavior, then yes. My specific use case isn't really that important; what is important is that I want to attach data to a type, and I'm trying to see if there is a faster way of doing it than a dictionary lookup, with the added bonus that I control the base class of the hierarchy of all the types I want to get data about. – Mike Marynowski May 01 '13 at 17:59
@MikeMarynowski - If you want to attach data to a class at compile time, then you do it by adding the data to the class. If you want to attach metadata to a class at compile time, then you either add it as a `virtual`/`override` property or you can create your own attributes for it. If you want to do any of this at runtime, that's an entirely different question, which has nothing to do with this at all, and is [probably not possible](http://stackoverflow.com/a/10047350/298754). – Bobson May 01 '13 at 18:06
As I've said, the issue is doing this at runtime...I'm fully aware of how it can be done at compile time. Your link led me to TypeDescriptor which might actually work here, I'm going to do some measurements and see what I get. – Mike Marynowski May 01 '13 at 18:16
@MikeMarynowski - Doing it at runtime is what I meant by "injecting". You're taking compiled code, and effectively adding more to it during runtime. It sounds more like you're adding *metadata* to it, rather than functionality, but it'd still be injection. I know little about it, so I'll bow out of the discussion now. Good luck. – Bobson May 01 '13 at 18:29
