1

I'm trying to improve the execution time for a operation.

I get a list of users, and by the user id I'm executing a method from a service which calculates some data. After i get the data by the user id, I set it to every entity, and than just save the context.

Currently with this simple code it take about 1 hour. Approximately 21 - 24 seconds per 100 users. But it depends on each and every user.

using (var ctx = new MyContext())
{
    var users = ctx.Users.Where(u => u.Status != -1).ToList();

    foreach (var user in users)
    {
        user.ProfilePercentage = await UserAlgorithm.CalculatePercentage(user.Id);
    }
    await ctx.SaveChangesAsync();
}

I also tried like this, but takes even longer:

Parallel.ForEach(
    users,
    new ParallelOptions { MaxDegreeOfParallelism = -1 },
    user => {
        user.ProfilePercentage = await UserAlgorithm.CalculatePercentage(user.Id); }
);

**CalculatePercentage **

public async Task<decimal> CalculatePercentage (int userId)
{
    decimal completeness = 0;
    bool? hasEducationsTags  = null; //--> nullable boolea will handle 3 cases on SP side
    bool? hasPosition    = null; //--> nullable boolea will handle 3 cases on SP side

    ///-- check if user has any education
    if(await SchoolService.HasEducations(userId).ConfigureAwait(false))
    {
        hasEducationsTags = await SchoolService.HasAnyFieldsOfStudy(user=>user.Id == userId);
    }

    ///-- check if user can set position
    if (await UserInfoService.CanSetLocation(userId).ConfigureAwait(false))
    {
        ///-- check if position was set
        string accountName = await UserInfoService.GetAccountName(userId).ConfigureAwait(false);
        hasPosition = await HasUserSetPosition(accountName).ConfigureAwait(false);
    }

    ///-- call store procedure
    using (SqlConnection sqlConnection = ConnectionFactory.Create())
    {
        await sqlConnection.OpenAsync().ConfigureAwait(false);

        using (SqlCommand sqlCommand = sqlConnection.CreateCommand())
        {
            sqlCommand.CommandType = CommandType.StoredProcedure;
            sqlCommand.CommandText = "ProfileCompletionAlgorithm";
            sqlCommand.Parameters.AddWithValue("@USER_ID", userId);
            sqlCommand.Parameters.AddWithValue("@HAS_POSITION", hasPosition);
            sqlCommand.Parameters.AddWithValue("@HAS_SCHOOL_TAGS", hasEducationsTags);

            ///-- not using OUTPUT parameter but by getting the return result
            SqlParameter sqlParameter = sqlCommand.Parameters.Add("@RESULT", SqlDbType.Decimal);
            sqlParameter.Direction    = ParameterDirection.ReturnValue;
            sqlCommand.ExecuteNonQuery();
            completeness              = Convert.ToDecimal(sqlParameter.Value);
        }
    }
    return completeness;
}

Some of the links that I've checked:

How can I optimize this logic in order to get the best time?

If you chose to downvote please explain

Thanks.

Gerald Hughes
  • 5,771
  • 20
  • 73
  • 131
  • 1
    Instead of awaiting each call just capture the resulting Tasks and do a `Task.WaitAll`. – juharr Jun 09 '17 at 12:45
  • 1
    could it be that by creating so many threads you get a lot of context switching overhead? – NtFreX Jun 09 '17 at 12:46
  • why do you think doing it in parallel will make it faster? – Keith Nicholas Jun 09 '17 at 12:46
  • Does UserAlgorithm.CalculatePercentage make some database queries? – Sir Rufo Jun 09 '17 at 12:50
  • if the ressource which is needed most is the processor I would only create as many threads as logical processor cores you have. – NtFreX Jun 09 '17 at 12:50
  • 5
    Whats happening in the UserAlgorithm.CalculatePercentage? – WBuck Jun 09 '17 at 12:52
  • @juharr maybe creating a list of tasks for multiple users, and than use Task.WhenAll(tasks); ? – Gerald Hughes Jun 09 '17 at 12:52
  • @NtFreX Well, I'm not sure, but when MaxDegreeOfParallelism is specified it will just limit the maximum number of threads, otherwise it's just on auto mode, same for MaxDegreeOfParallelism = -1 value. – Gerald Hughes Jun 09 '17 at 12:54
  • 2
    Without really knowing what `UserAlgorithm.CalculatePercentage` does I'd create an async method that takes a `User` and does that one line of code then in the `foreach` capture each `Task` in a `List` and then wait for all of them to finish before calling `SaveChangesAsync`. But your preformance issues might actually be in the `CalculatePercentage` method. What I wouldn't do is mix `Parallel.ForEach` with async code like this. – juharr Jun 09 '17 at 12:54
  • @KeithNicholas well, I was expecting that a number of iterations may run in parallel, and this way to get a better time, but not worst :) – Gerald Hughes Jun 09 '17 at 12:55
  • @WBuck UserAlgorithm.CalculatePercentage is calling a stored procedure with some data that is obtained from other services that accesses the db – Gerald Hughes Jun 09 '17 at 12:58
  • is it absolutely essential to do this for one user at a time? can it not be wrapped up into one statement/stored proc? – Dave Becker Jun 09 '17 at 12:58
  • @DaveBecker, at the moment yes, unfortunately. – Gerald Hughes Jun 09 '17 at 12:59
  • 1
    Can you please show us the source code for CalculatePercentage ? – mjwills Jun 09 '17 at 13:15
  • @mjwills Sure, I've updated my question – Gerald Hughes Jun 09 '17 at 13:21
  • I see no need for the `ToList` part of this statement: `var users = ctx.Users.Where(u => u.Status != -1).ToList()`. This could indicate a basic wasteful coding style that propagates throughout the code and results in poor performance. Edit: I have not reevaluated, this observation in view of the just updated post. – TnTinMn Jun 09 '17 at 13:21
  • You are creating a connection for every thead. This coasts a lot! – NtFreX Jun 09 '17 at 13:53
  • Can you run CalculatePercentage a few times, and give us statistics about what part of it are slow? https://stackoverflow.com/a/14019526/34092 will help you with that. – mjwills Jun 09 '17 at 14:05
  • Any luck identifying the slow parts of the code @SGN ? – mjwills Jun 10 '17 at 22:28

1 Answers1

2

Since your code is I/O-bound (not CPU-bound), throwing more threads (or parallelism) at the problem isn't going to help. What you want is asynchronous concurrency (Task.WhenAll), not parallel concurrency (Parallel):

using (var ctx = new MyContext())
{
  var users = ctx.Users.Where(u => u.Status != -1);
  var tasks = users.Select(UpdateUserProfilePercentageAsync);
  await Task.WhenAll(tasks);
  await ctx.SaveChangesAsync();
}

private async Task UpdateUserProfilePercentageAsync(User user)
{
  user.ProfilePercentage = await UserAlgorithm.CalculatePercentage(user.Id);
}

You may find out that your database or other services don't appreciate being hammered so hard, in which case you could do some throttling. E.g., for 20 at a time:

private SemaphoreSlim _throttle = new SemaphoreSlim(20);
private async Task UpdateUserProfilePercentageAsync(User user)
{
  await _throttle.WaitAsync();
  try
  {
    user.ProfilePercentage = await UserAlgorithm.CalculatePercentage(user.Id);
  }
  finally { _throttle.Release(); }
}
Stephen Cleary
  • 437,863
  • 77
  • 675
  • 810