I'm working with a piece of code that processes messages from a queue (using MassTransit). Many messages can be processed in parallel. All messages create or modify an object in Active Directory (in this case). All objects need to be validated against the AD schema definitions. (Though it's not relevant to the problem, I want to note that we have many customers with custom extensions in their AD schema.)
Retrieving the schema information is a slow operation. I want to do it once and then cache the result. But because many messages are processed in parallel, several of them start retrieving the schema information before the first one has finished, so far too much work is done. For the moment I fixed this with a simple semaphore; see the code below.
But that is not a good solution, as now only one thread at a time can enter this code, even when the threads need different schema objects.
I need something that locks the code once per object and holds off other requests until the first retrieval and caching is complete.
What kind of construct will allow me to do that?
private static readonly SemaphoreSlim _lock = new SemaphoreSlim(1, 1);

public ActiveDirectorySchemaObject? GetSchemaObjectFor(string objectClass)
{
    // todo: create better solution
    // One global semaphore: only one thread at a time gets past this point,
    // whichever customer or object class it is asking for.
    _lock.Wait();
    try
    {
        if (!_activeDirectorySchemaContainer.HasSchemaObjectFor(
                _scopeContext.CustomerId, objectClass))
        {
            _logger.LogInformation(
                "Getting and caching schema from AD for {ObjectClass}", objectClass);
            _activeDirectorySchemaContainer.SetSchemaObjectFor(
                _scopeContext.CustomerId, objectClass,
                GetSchemaFromActiveDirectory(objectClass));
        }
    }
    finally
    {
        _lock.Release();
    }

    return _activeDirectorySchemaContainer.GetSchemaObjectFor(
        _scopeContext.CustomerId, objectClass);
}
The following is a possible simplification of the question. In short: I am looking for the proper construct to lock a piece of code against parallel access, separately for every variation of an input.
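To make "a lock per variation of an input" concrete, here is a minimal sketch of what I imagine: one SemaphoreSlim per key, kept in a ConcurrentDictionary. Everything in it is made up for illustration (the PerKeyLockSketch class, the "customer|objectClass" string key, and SlowLoad, which stands in for the slow AD lookup); it is not my real code and I have not tried it against the real service.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class PerKeyLockSketch
{
    // One semaphore per key, so callers for different keys never block each other.
    private static readonly ConcurrentDictionary<string, SemaphoreSlim> Locks =
        new ConcurrentDictionary<string, SemaphoreSlim>();

    // Hypothetical cache standing in for the schema container.
    private static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>();

    private static int _loadCount;
    public static int LoadCount => _loadCount;

    // Hypothetical stand-in for the slow Active Directory lookup.
    private static string SlowLoad(string key)
    {
        Interlocked.Increment(ref _loadCount);
        Thread.Sleep(200);
        return "schema:" + key;
    }

    public static string GetOrLoad(string key)
    {
        // Fast path: the value is already cached, no locking needed.
        if (Cache.TryGetValue(key, out string cached))
        {
            return cached;
        }

        SemaphoreSlim gate = Locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
        gate.Wait();
        try
        {
            // Check again: another thread may have filled the cache while we waited.
            if (Cache.TryGetValue(key, out string loadedByOther))
            {
                return loadedByOther;
            }

            string value = SlowLoad(key);
            Cache[key] = value;
            return value;
        }
        finally
        {
            gate.Release();
        }
    }

    public static void Main()
    {
        // 20 parallel requests for only two distinct keys.
        Parallel.For(0, 20, i =>
        {
            GetOrLoad(i % 2 == 0 ? "customer1|user" : "customer1|group");
        });

        Console.WriteLine(LoadCount); // prints 2: one slow load per key
    }
}
```

The semaphores are never removed from the dictionary in this sketch, which seems acceptable as long as the number of customer and object class combinations stays small.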
A comment mentioned Lazy, something I have not used before. Reading the docs, I see that it defers the initialization of an object until it is first needed. Maybe I could refactor for that. But looking at the code as it currently is, I seem to need a lazy "if" or a lazy "function". Maybe I am overcomplicating things; I find that thinking about parallel programming often hurts my head.
As requested, here is the code of the schema container class, containing SetSchemaObjectFor and the other functions. Thanks so far for all the information provided.
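If I read the Lazy docs correctly, the "lazy function" I am after could be a Lazy<T> stored per key in a ConcurrentDictionary. Below is a minimal sketch under that assumption. The LazyCacheSketch class, the tuple key, and SlowLoad (a stand-in for the slow AD lookup that returns a string instead of a real schema object) are all hypothetical names for illustration, not my real code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class LazyCacheSketch
{
    // Key: (customerId, objectClass). Value: a Lazy that runs the slow load once.
    private static readonly ConcurrentDictionary<(string, string), Lazy<string>> Cache =
        new ConcurrentDictionary<(string, string), Lazy<string>>();

    private static int _loadCount;
    public static int LoadCount => _loadCount;

    // Hypothetical stand-in for the slow Active Directory lookup.
    private static string SlowLoad(string customerId, string objectClass)
    {
        Interlocked.Increment(ref _loadCount);
        Thread.Sleep(200);
        return "schema:" + customerId + "/" + objectClass;
    }

    public static string GetSchema(string customerId, string objectClass)
    {
        // GetOrAdd may construct more than one Lazy under contention, but only one
        // is stored and handed back to every caller. Constructing a Lazy is cheap;
        // the slow load only runs when .Value is read on the stored instance.
        Lazy<string> lazy = Cache.GetOrAdd(
            (customerId, objectClass),
            key => new Lazy<string>(
                () => SlowLoad(key.Item1, key.Item2),
                LazyThreadSafetyMode.ExecutionAndPublication));

        // The first caller runs the load; concurrent callers block here until it is done.
        return lazy.Value;
    }

    public static void Main()
    {
        // 20 parallel requests for only two distinct keys.
        Parallel.For(0, 20, i =>
        {
            GetSchema("customer1", i % 2 == 0 ? "user" : "group");
        });

        Console.WriteLine(LoadCount); // prints 2: one slow load per key
    }
}
```

One thing to keep in mind with this approach: in ExecutionAndPublication mode a Lazy caches an exception thrown by its factory, so a failed lookup would keep failing for that key until the entry is removed from the dictionary.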
public interface IActiveDirectorySchemaContainer
{
    //Dictionary<string, Dictionary<string, JObject>> schemaStore { get; }
    bool HasSchemaObjectFor(string customerId, string objectClass);
    ActiveDirectorySchemaObject GetSchemaObjectFor(string customerId, string objectClass);
    void SetSchemaObjectFor(string customerId, string objectClass, ActiveDirectorySchemaObject schema);
}

public class ActiveDirectorySchemaContainer : IActiveDirectorySchemaContainer
{
    // Outer key: customerId. Inner key: objectClass.
    private readonly Dictionary<string, Dictionary<string, ActiveDirectorySchemaObject>> _schemaStore =
        new Dictionary<string, Dictionary<string, ActiveDirectorySchemaObject>>();

    public bool HasSchemaObjectFor(string customerId, string objectClass)
    {
        if (!_schemaStore.ContainsKey(customerId))
            return false;
        if (!_schemaStore[customerId].ContainsKey(objectClass))
            return false;
        return _schemaStore[customerId][objectClass] != null;
    }

    public ActiveDirectorySchemaObject GetSchemaObjectFor(string customerId, string objectClass)
    {
        return _schemaStore[customerId][objectClass];
    }

    public void SetSchemaObjectFor(string customerId, string objectClass, ActiveDirectorySchemaObject schemaObject)
    {
        // Create the per-customer dictionary on first use.
        if (!_schemaStore.ContainsKey(customerId))
        {
            _schemaStore.Add(customerId, new Dictionary<string, ActiveDirectorySchemaObject>());
        }

        // The indexer adds the entry or overwrites an existing one.
        _schemaStore[customerId][objectClass] = schemaObject;
    }
}
The customerId separates the schema information of different customers, and the container is provided by dependency injection as a singleton. Every message can have a different customerId and be processed concurrently, yet I want to retrieve the schema data only once per customer and object class. This architecture might not be ideal, but I am not allowed to change it at this time.
public static IServiceCollection AddActiveDirectorySchemaService(
    this IServiceCollection services)
{
    services.AddScoped<IActiveDirectorySchemaService, ActiveDirectorySchemaService>();
    services.AddSingleton<IActiveDirectorySchemaContainer, ActiveDirectorySchemaContainer>();
    return services;
}