I'm implementing a cache table to avoid repeating a costly operation that creates a generic object from a set of parameters describing it. When an object is requested, a hash of those parameters is computed and a Dictionary of already-created objects is queried; if a copy already exists, it's returned without being created again.
My problem is that since the parameters describing these objects can be many, collisions in the hash function are unavoidable (and too frequent), but on the other hand retrieving these objects is a performance-critical operation and I cannot afford full comparisons against all existing descriptions to find an already-created object. I've tried many different hash functions, but since the nature of the parameters is unknown, the results are unreliable.
What solutions other than hashing are there for this caching problem, or can hashing be used differently to avoid collisions? A C# description of the problem:
class ObjectDescriptor
{
    // description made of a list of parameters of unknown type
    public object[] Fields;

    // hashing procedure that may produce collisions
    public override int GetHashCode()
    {
        int hash = 1009;
        for (int i = 0; i < Fields.Length; i++)
        {
            unchecked { hash = hash * 9176 + Fields[i].GetHashCode(); }
        }
        return hash;
    }
}
abstract class ObjectCache<T>
{
    private Dictionary<int, T> indexedObjects = new Dictionary<int, T>();

    // this operation is called many times and must be fast
    public T Get(ObjectDescriptor descr)
    {
        T cachedValue;
        int hash = descr.GetHashCode();
        if (!indexedObjects.TryGetValue(hash, out cachedValue))
        {
            cachedValue = CreateObject(descr);
            indexedObjects[hash] = cachedValue;
        }
        // note: two descriptors whose hashes collide silently share one entry
        return cachedValue;
    }

    // costly operation
    protected abstract T CreateObject(ObjectDescriptor descr);
}