My goal is to speed up a query, and I thought I could leverage parallelism. Let's assume I have 2,000 items in the ids list and I split them into 4 lists of 500 IDs each. I want to start 4 threads, each of which makes one DB call, and then combine their results. To achieve that I used Parallel.ForEach, but it did not improve the performance of the query, apparently because it is not well suited to IO-bound operations: Parallel execution for IO bound operations
The code in the if block uses Parallel.ForEach, whereas the code in the else block does the same work in a regular foreach.
The problem is that the method containing this query is not async (it lives in a very legacy component) and cannot be changed to async. Basically, I want to run IO-bound work in parallel inside a non-async method (via Entity Framework).
What are the best practices to achieve this goal? I saw that maybe I can use Task.WaitAll() for that. I do not mind blocking the thread that runs this query; I am more concerned that something will go wrong when Task.WaitAll() is called from a non-async method.
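For illustration, this is roughly the Task.WaitAll() variant I have in mind. It is only a minimal sketch under assumptions: GetIdsQueryResultAsync is a hypothetical async counterpart of my GetIdsQueryResult (here it just simulates the DB round trip with a delay), and LoadAll stands in for the legacy synchronous method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WaitAllSketch
{
    // Hypothetical stand-in for an async version of GetIdsQueryResult.
    // In the real code this would open its own context and await the EF query.
    private static async Task<List<long>> GetIdsQueryResultAsync(IEnumerable<long> bucket)
    {
        await Task.Delay(50).ConfigureAwait(false); // simulates the DB round trip
        return bucket.ToList();
    }

    // Synchronous entry point, like the legacy method that cannot become async.
    public static List<long> LoadAll(IEnumerable<IEnumerable<long>> idsToLoad)
    {
        // Start one task per bucket. Task.Run moves the work to the thread pool,
        // so no continuation needs the (blocked) calling thread to complete.
        Task<List<long>>[] tasks = idsToLoad
            .Select(bucket => Task.Run(() => GetIdsQueryResultAsync(bucket)))
            .ToArray();

        // Blocks the calling thread until every task has finished.
        Task.WaitAll(tasks);

        // Each task owns its own list, so no lock is needed while merging.
        return tasks.SelectMany(t => t.Result).ToList();
    }

    public static void Main()
    {
        var buckets = new[] { new long[] { 1, 2 }, new long[] { 3, 4 } };
        List<long> all = LoadAll(buckets);
        Console.WriteLine(all.Count);
    }
}
```

Note that this sketch starts one task per bucket without any cap, which matches my case of 4 buckets but would need throttling for a larger number of buckets.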
I use Entity Framework as the ORM over a SQL database. For each thread I open a separate context, because the context is not thread-safe.
Maybe the lock that I use is what causes the problem; I can change it to a ConcurrentDictionary.
The scenario depicted in the code below is simplified from the one I need to improve. In our real application I do need to read the related entities after I have loaded their IDs, and to perform a complicated calculation on them.
Code:
// ids.Bucketize(bucketSize: 500) -> splits one big list into several lists, each with 500 ids
IEnumerable<IEnumerable<long>> idsToLoad = ids.Bucketize(bucketSize: 500);
if (ShouldLoadDataInParallel())
{
    object parallelismLock = new object();
    Parallel.ForEach(idsToLoad,
        new ParallelOptions { MaxDegreeOfParallelism = 4 },
        (IEnumerable<long> bucket) =>
        {
            List<long> loadedIds = GetIdsQueryResult(bucket);
            lock (parallelismLock)
            {
                allLoadedIds.AddRange(loadedIds);
            }
        });
}
else
{
    foreach (IEnumerable<long> bucket in idsToLoad)
    {
        List<long> loadedIds = GetIdsQueryResult(bucket);
        allLoadedIds.AddRange(loadedIds);
    }
}
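Bucketize is our own helper and its real implementation is not shown here. A minimal sketch of what it does, assuming it simply yields consecutive chunks of at most bucketSize items (the class names below are hypothetical), would be:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Hypothetical sketch of Bucketize: yields consecutive chunks of at most bucketSize items.
    public static IEnumerable<IEnumerable<T>> Bucketize<T>(this IEnumerable<T> source, int bucketSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (bucketSize <= 0) throw new ArgumentOutOfRangeException(nameof(bucketSize));
        return BucketizeIterator(source, bucketSize);
    }

    private static IEnumerable<IEnumerable<T>> BucketizeIterator<T>(IEnumerable<T> source, int bucketSize)
    {
        var bucket = new List<T>(bucketSize);
        foreach (T item in source)
        {
            bucket.Add(item);
            if (bucket.Count == bucketSize)
            {
                yield return bucket;
                bucket = new List<T>(bucketSize); // start a fresh bucket
            }
        }
        if (bucket.Count > 0) yield return bucket; // last, possibly smaller, bucket
    }
}

public static class Program
{
    public static void Main()
    {
        List<long> ids = Enumerable.Range(1, 2000).Select(i => (long)i).ToList();
        List<IEnumerable<long>> buckets = ids.Bucketize(bucketSize: 500).ToList();
        Console.WriteLine(buckets.Count); // 2,000 ids in buckets of 500 gives 4 buckets
    }
}
```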