
Abstract

I am writing an application which has a few object caches. The way it needs to work is that when an object is retrieved from the cache:

object foo = CacheProvider.CurrentCache.Get("key");

foo should be a local copy of the original object, not a reference. What is the best way to implement this? The only way I have in mind so far is to round-trip the object through a BinaryFormatter to create a copy, but I feel like I am missing a better way.

Details

The backing for the cache implementation is arbitrary, as it is provider-based. I need to support any number of caches, from the HttpRuntime cache to something like Velocity. The focus here is on the layer between the cache backing and the consumer code - that layer must ensure a copy of the object is returned. Some caches will already do this, but some do not (HttpRuntime cache being one).
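As a rough illustration of that layer (all type names here are hypothetical, not from any real provider API), a decorator over an arbitrary backing could deep-copy on the way out via a BinaryFormatter round-trip. BinaryFormatter fits the .NET Framework of this era; it is obsolete in modern .NET.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical provider contract; the real backing (HttpRuntime cache,
// Velocity, etc.) would live behind this interface.
public interface ICacheBacking
{
    object Get(string key);
    void Set(string key, object value);
}

// Trivial in-memory backing, used only for illustration.
public class DictionaryBacking : ICacheBacking
{
    private readonly Dictionary<string, object> store = new Dictionary<string, object>();
    public object Get(string key) { object v; store.TryGetValue(key, out v); return v; }
    public void Set(string key, object value) { store[key] = value; }
}

// The layer in question: every Get returns a deep copy, so callers can
// mutate their copy without touching the cached instance.
public class CopyingCache
{
    private readonly ICacheBacking backing;
    public CopyingCache(ICacheBacking backing) { this.backing = backing; }

    public void Set(string key, object value) { backing.Set(key, value); }

    public object Get(string key)
    {
        object cached = backing.Get(key);
        if (cached == null) return null;
        // BinaryFormatter round-trip as the copy mechanism; the stored
        // object must be [Serializable]. Another cloning strategy could
        // be swapped in here without touching the backings.
        var formatter = new BinaryFormatter();
        using (var ms = new MemoryStream())
        {
            formatter.Serialize(ms, cached);
            ms.Position = 0;
            return formatter.Deserialize(ms);
        }
    }
}
```

Because the copy happens in the decorator, backings that already return copies could bypass it, while ones that hand out live references (like the HttpRuntime cache) get wrapped.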

Rex M
  • Once in a while, it is good to see experts asking questions ;) BTW, how deep can your object be? Does it refer to other objects, etc.? – shahkalpesh Jul 21 '09 at 02:55
  • @shahkalpesh High score != expert, but cheers. The caches should ideally be able to store any object within reason. I can probably add the requirement that other objects are lazy-loaded. – Rex M Jul 21 '09 at 02:57
  • Rex, could you describe your scenario a little more? Do you need every call to handle you a new copy of that same object in a client scenario? Is your cache inproc or out of proc? Is this a server app (I guess so)? Do you have stateful clients connected to this app? Do you have some caching layer on the clients as well? – user134706 Jul 21 '09 at 12:47
  • @runtime let's just focus on the server for now. The cache may be either in proc or out of proc, as the implementation details of the cache are left to the provider. This mechanism should essentially be stateless. When I get a copy of an object out of cache, I can create a local session and attach it if necessary. – Rex M Jul 21 '09 at 13:52
  • The reason I asked those questions was to better understand your requirements, since I'm not sure why would you need such an access pattern. I mean, yeah you should copy on write for synchronization purposes, but why also on read? It's not like cached pieces of data would be updated inplace internally. – user134706 Jul 21 '09 at 17:32
  • @runtime because the calling thread may do any number of arbitrary things to its copy of the object, including setting properties & changing values, which I do not want to be reflected in the cache or on the same entity in other threads. Committing changes and new objects back to the cache goes through an entirely different workflow. – Rex M Jul 21 '09 at 17:54

3 Answers


Rex - we implemented something very similar in a big enterprise web product, and we ended up explicitly creating an IDeepCloneable interface that many of our objects implemented. The interesting thing is that we backed it with the BinaryFormatter, simply because it's a convenient and virtually foolproof way to make a deep copy of an object without all the hassle of reflection or forcing developers to write and test clone methods.

It works like a charm. Our team has stopped to mull it over several times, and we have yet to come up with a better option.
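A minimal sketch of the pattern this answer describes (type names are illustrative, not from the actual product): the interface gives consumers a single call, and a BinaryFormatter round-trip does the deep copy, so no per-class clone logic has to be written or tested. BinaryFormatter was standard at the time; it is obsolete in modern .NET.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

public interface IDeepCloneable
{
    object DeepClone();
}

// Illustrative object graph: the nested list gets copied too.
[Serializable]
public class Customer : IDeepCloneable
{
    public string Name;
    public List<string> Orders = new List<string>();

    // Serialize/deserialize round-trip: the whole [Serializable] graph
    // is duplicated with no hand-written clone code to maintain.
    public object DeepClone()
    {
        var formatter = new BinaryFormatter();
        using (var ms = new MemoryStream())
        {
            formatter.Serialize(ms, this);
            ms.Position = 0;
            return formatter.Deserialize(ms);
        }
    }
}
```

The trade-off is the serialization cost on every clone and the requirement that the whole graph be [Serializable], in exchange for correctness on nested references that reflection-based copiers often get wrong.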

womp

Sorry, I must be underthinking this. Why not look up the objects in a dictionary and then create a deep copy of the object (e.g., IDeepCopyable) or use reflection to do the job?

Roughly:

public interface IDeepCopyable {
    object DeepCopy();
}

public class Cache<TKey, TValue> where TValue : IDeepCopyable {
    Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

    // omit dictionary-manipulation code

    public TValue this[TKey key] {
        get {
            return dictionary[key].DeepCopy(); // could use reflection to clone too
        }
    }
}

If you go the reflection route, Marc Gravell has some nice cloning code easily modified to make a deep copy.

jason
  • Thanks. I don't quite follow - can you elaborate? – Rex M Jul 21 '09 at 02:49
  • This is a good answer given the limited information in my question. However - I cannot reasonably make the object responsible for copying itself, and I cannot use a dictionary, as the cache providers need to be much more robust, and have the option of using another storage backing (such as HttpRuntime cache). – Rex M Jul 21 '09 at 03:00
  • @Rex M: For the cloning, can you use the reflection route then? For caching the objects, take a look at Velocity (http://msdn.microsoft.com/en-us/data/cc655792.aspx). – jason Jul 21 '09 at 03:01
  • @Rex M: If I understand the additional details that you just appended to your question, you're looking to write an adapter on top of existing caching solutions such that the adapter will provide a copy of an object stored in the cache to consumers? I still must be underthinking this but I don't see how the solution isn't lookup object in cache, return copy of object created using reflection. – jason Jul 21 '09 at 03:14
  • @Jason you're definitely not underthinking it. You've described what I am after - if reflection is the right way to go, please say so. And if so, what are the benefits of reflection over binary serialization? – Rex M Jul 21 '09 at 03:19

Here is a simple function that will use reflection to deep copy an object, regardless of type. I culled this, in part, from a much more complex copy routine I used in the old Web Services days to copy between Nearly Identical(tm) data types. It might not work exactly, but it gives you the general idea. It is very simplistic, and when using raw reflection there are many boundary cases...

public static object ObjCopy(object inObj)
{
    if( inObj == null ) return null;
    System.Type type = inObj.GetType();

    if(type.IsValueType)
    {
        return inObj;
    }
    else if(type == typeof(string))
    {
        //strings are immutable, so sharing the reference is safe
        return inObj;
    }
    else if(type.IsClass)
    {
        //note: requires a public parameterless constructor
        object outObj = Activator.CreateInstance(type);
        System.Type fieldType;
        //GetFields() without BindingFlags returns public instance fields only
        foreach(FieldInfo fi in type.GetFields())
        {
            //fi.GetType() would return FieldInfo itself; FieldType is the field's declared type
            fieldType = fi.FieldType;
            if(fieldType.IsValueType) //Value types get copied
            {
                fi.SetValue(outObj, fi.GetValue(inObj));
            }
            else if(fieldType.IsClass) //Classes go deeper
            {
                //Recursion
                fi.SetValue(outObj, ObjCopy(fi.GetValue(inObj)));
            }
        }
        return outObj;
    }
    else
    {
        return null;
    }
}

Personally, I would use the serializer routines, since they automatically handle all the boundary cases. By default, at least in .NET 2.0, the system dynamically creates the serialization assembly. You can speed up serialization by caching the serializers. This sample code from my app caches the XmlSerializers.

protected static XmlSerializer SerializerGet(System.Type type)
{
    XmlSerializer output = null;
    lock(typeof(SerializeAssist)) //locking on a public Type is risky; a private static lock object is safer
    {
        if(serializerList.ContainsKey(type))
        {
            output = serializerList[type];
        }
        else
        {
            if(type == typeof(object) || type == typeof(object[]) || type == typeof(ArrayList))
            {
                output = new XmlSerializer(type, objArrayTypes);
            }
            else
            {
                output = new XmlSerializer(type);
            }
            serializerList.Add(type, output);
        }
    }
    return output;
}
Bill Crim
  • Thanks William. I have done something very similar to your XML serializer - FYI, yours does not look thread-safe. – Rex M Jul 21 '09 at 03:25
  • HAHAHAH I removed the locking and thread safety code to make it clearer. :-) – Bill Crim Jul 21 '09 at 03:28
  • Ah :) I'd recommend adding it back... We don't want some poor dev stumbling on to this page a year from now and dropping that into her production app! – Rex M Jul 21 '09 at 03:33