0

I am completely new to threading. I have such a method in which I am trying to implement parallel executions in thread safe manner(at least I hope so):

private void PerformSearch(List<FailedSearchReportModel> failedSearchReports)
    {
        foreach (var item in failedSearchReports)
        {
            item.SearchTerms = item.SearchTerms.Take(50).ToList();
            var siteId = ConstantsEnumerators.Constants.Projects.GetProjectIdByName(item.Site);
            if (SearchWrapperHelper.IsSeas(siteId))
            {
                item.UsedEngine = "Seas";
                var model = GetBaseQueryModel(item.Site);
                Parallel.ForEach(item.SearchTerms,
                         new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 2.0)) },
                         (term) =>
                     {
                         lock (seasSyncRoot)
                         {
                             CheckSearchTermInSeas(model, term, item.Site, item.Language);
                         }
                     });
            }
            else
            {
                item.UsedEngine = "Fast";
                Parallel.ForEach(item.SearchTerms, term =>
                    {
                        lock (fastSyncRoot)
                        {
                            CheckSearchTermInFast(term, item.Site, item.Language);
                        }
                    });
            }
        }
    }

Even though in guidelines it is mentioned for lock statement only to wrap as little amount of code as possible, here is how nested CheckSearchTerm method looks like:

private void CheckSearchTermInSeas(SearchQueryModel baseModel, FailedSearchTermModel term, string site, string language)
    {
        var projectId = ConstantsEnumerators.Constants.Projects.GetProjectIdByName(site);

        term.SearchTerm = ClearSearchTerm(term.SearchTerm).Replace("\"", string.Empty);
        var results = SearchInSeas(baseModel, term.SearchTerm, projectId, language);
        term.DidYouMean = GetDidYouMean(results?.Query.Suggestion, term.SearchTerm);
        term.HasResult = results?.NumberOfResults > 0;
        if (!term.HasResult && string.IsNullOrEmpty(term.DidYouMean))
        {
            results = SearchInSeas(baseModel, term.SearchTerm, projectId, InverseLanguage(language));
            term.WrongLanguage = results?.NumberOfResults > 0;
            if (!term.WrongLanguage)
            {
                term.DidYouMean = GetDidYouMean(results?.Query.Suggestion, term.SearchTerm);
            }
        }

        if (!string.IsNullOrEmpty(term.DidYouMean))
        {
            results = SearchInSeas(baseModel, term.DidYouMean, projectId, term.WrongLanguage ? InverseLanguage(language) : language);
            term.DidYouMeanHasResult = results?.NumberOfResults > 0;
            if (!term.DidYouMeanHasResult)
            {
                results = SearchInSeas(baseModel, term.DidYouMean, projectId, term.WrongLanguage ? language : InverseLanguage(language));
                term.DidYouMeanHasResult = results?.NumberOfResults > 0;
            }
        }
    }

Am I doing everyting right, can you please provide some explanation? Or should I change it? PS: Now if i need to write all this records into the file(excel) should I also use Parallel to increase performance? And if so approach would be the same?

  • Is that C# code? I would strongly recommend that you tag the question with the language in which the code is written. – John Bollinger Nov 02 '21 at 04:55
  • 2
    As far as i knew, the office interop libraries are not thread safe – TheGeneral Nov 02 '21 at 05:13
  • @TheGeneral, thank you! – red guardgen Nov 02 '21 at 05:17
  • I suppose we do not need to add the lock since we are not adding or modifying and common variable/list. If you have to suggest using ConcurrentDictionary instead of List. In addition, changing the private function to a static function if working with only Immutable objects improve the performance. – Rajeesh Madambat Nov 02 '21 at 05:33
  • 1
    I would suggest checking out pure methods and immutable types. These are thread-safe by default. If you have non-pure methods or do not know, I would recommend staying away from multi threading, or at least being very careful. Using a parallel loop around a lock in that way is a terrible idea, but we cannot know if the called methods are thread safe or not. I would suggest starting with profiling to see if multi threading would help in the first place. – JonasH Nov 02 '21 at 07:14
  • What is the type of application you are working on? WinForms? ASP.NET? Windows Service? – Theodor Zoulias Nov 02 '21 at 07:49
  • @TheodorZoulias, asp web forms – red guardgen Nov 02 '21 at 07:51
  • @JonasH, the problem is this component runs for 3 hours now( – red guardgen Nov 02 '21 at 08:03
  • runs for 3 hours doing what? IO? Calling into notoriously slow office interop? CPU processing? Multi threading would only help with the last one, and only if the processing is suitable for parallelism. Whenever performance is a concern start by *profiling* – JonasH Nov 02 '21 at 09:47

1 Answers1

1

In ASP.NET applications, threads are a precious resource. The more threads you have available, the more requests you can serve concurrently. The threads for serving requests and for doing parallel work are coming from the same pool, the ThreadPool. So the more parallel work you do, the less concurrent clients you can serve. Doing parallel work with the Parallel.ForEach is particularly nasty when the loop is not configured with the MaxDegreeOfParallelism option. This beast can single-handedly saturate your ThreadPool by using every available thread the pool has, and requesting even more. One non-configured Parallel.ForEach in your web app is enough to reduce the scalability of your application to nothingness.

The second Parallel.ForEach in your code, the one that searches with the item.UsedEngine = "Fast" setting, is not configured.

And what all these threads are going to do? Almost nothing. At most one or two threads will be doing work, and all the others will be blocked behind the lock, waiting for their turn. That's not an efficient way to utilize the resources of your server. By using parallelism, you made your web app slower for everyone. If you have a performance problem in a web application, introducing parallelism should be your last thought as a solution. It's much more likely to intensify the problem than to solve it.

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104