1

we have an application that does some processing at night. Simply put, it creates some statistics for every user (about 10.000).

Now we have noticed that this takes hours in the production environment and we have been able to simulate this using a backup of the production database.

And we see that when the foreach loop starts, it generally takes around 200 ms to generate the data and save it to the database for one user. After about a 1000 users, this gets up to 700 ms per user. And after about 2.000 users, it starts to just take longer and longer, all the way up to 2 seconds to generate the data and save it to the database per user.

Do you have any ideas why this may be? Here is the (simplified) code to show what happens:

var userService = IoC.GetInstance<IUserService>();
var users  = userService.GetAll().Where(x => x.IsRegistered == true);

var statisticsService = IoC.GetInstance<IStatisticsService>();

foreach (var user in users.ToList())
{
   var statistic1 = GetStatistic(1); // this returns an object
   var statistic 2 = GetStatistic(2);

   statisticsService.Add(statistic1);
   statisticsService.Add(statistic2);

   statisticsService.Commit(); /* this is essentially a dbContext.SaveChanges(); */
}

function Statistic GetStatistic(int statisticnumber)
{
   var stat = new Statistic();
   stat.Type = statisticnumber;

   switch(statisticnumber) {
       case 1:
          stat.Value = /* query to count how many times they've logged in */
          break;
       case 2:
          stat.Value = /* query to get their average score */
          break;
       ...
    }
    return stat;
}

So far we have tried:

  1. AsNoTracking: we select the users using AsNoTracking to make sure Entity Framework doesn't track any changes to the user itself (because there are none).
  2. Clearing indexes on statistics table: before this starts we run a script that drops all indexes (except the clustered index). After generation we recreate these indexes

Does anyone have any additional things we can test/try ?

Steven Lemmens
  • 1,441
  • 3
  • 17
  • 30
  • 1
    Try to create a dbContext for every loop (user) – Alberto Jul 10 '17 at 09:15
  • 2
    First thing I would consider is to move the call to `dc.SaveChanges()` outside of the loop. But that does not really explain your problem – Romano Zumbé Jul 10 '17 at 09:16
  • Maybe this answer can help: https://stackoverflow.com/a/10103722/1220550 – Peter B Jul 10 '17 at 09:21
  • 1
    We can't see the actual problem without the code from GetStatistic. Because the problem that will make it that slow isnt in this code – EpicKip Jul 10 '17 at 09:25
  • I agree with Alberto - your db context is accumulating more and more data to keep track of each time through the loop. It ends up with a lot of dead weight - create a new db context every so many iterations. – Adam Benson Jul 10 '17 at 09:27
  • Okay thank you. How do you create a new db context? (Mind you, in the real code, I work through a Service object that has access to the DbContext, while my script doesn't have access directly, but that doesn't matter because I have control of the complete code) – Steven Lemmens Jul 10 '17 at 09:34
  • That does complicate things. How does your Service manage the lifetime of the DbContext? It makes that your current code is not a good sample. – H H Jul 10 '17 at 09:36
  • @EpicKip: I have added my GetStatistic function. It just creates a new object, does one query depending on the parameter and returns the object. – Steven Lemmens Jul 10 '17 at 09:39
  • 1
    In my eyes this whole statistics generation procedure should live in the database only. I would recommend creating a stored procedure in your DB and only trigger it from your application periodically. – Rob Jul 10 '17 at 09:40
  • I would agree with that @Robert: only that the real code does use some application logic that we don't want to duplicate to the database if at all avoidable. – Steven Lemmens Jul 10 '17 at 09:41
  • @StevenLemmens Sadly I don't see anything there. Good luck :) – EpicKip Jul 10 '17 at 09:41
  • You should ensure that a DbContext is only used for a limited number of updates (1 in Alberto's comment, but up to 10 or 100 sounds fine too). Also use NoTracking everywhere, especially in the GetStatistics queries. – H H Jul 10 '17 at 09:43
  • @HenkHolterman: the Service creates a new DbContext in its constructor. And destroys it when the service is destroyed. And I get the service with Ninject (IoC). – Steven Lemmens Jul 10 '17 at 09:44
  • You could try [detaching](https://msdn.microsoft.com/de-de/library/system.data.objects.objectcontext.detach(v=vs.110).aspx) your saved statistics. – Christoph Sonntag Jul 10 '17 at 09:55
  • @StevenLemmens - OK, that ties the DbContext to the service but how do you manage the lifetime of that service? Is it per request, _what is a request_ in relation to this sample? etc. The most crucial part is missing. – H H Jul 10 '17 at 10:30
  • @Compufreak - Yes, untverknupfen is a good idea but a little hacky. Disposing the context every 10 records is much more reliable and scaleable. – H H Jul 10 '17 at 10:33
  • @HenkHolterman: it is per request. The Ninject container is set to create a new service (and thus a new DbContext as well) every time a service is requested. – Steven Lemmens Jul 10 '17 at 11:30
  • What we still need to know: is `IStatisticsService : IDisposable`, which IoC, how is the service configured? Basically, the IoC gets in the way of micro-managing the context here. Consider working around (without) it. Or take up CompuFreaks suggestion for changing that Add() method. – H H Jul 10 '17 at 12:21

2 Answers2

2

As you can see in comments you need to keep the context clean so you need to Dispose it every n records (usually, in my case, n < 1000).
This is a good solution in most cases. But there are some issues, the most important are:
1. When you need to insert a lot of records, running insert (and update) statements runs faster.
2. The entities you write (and related entities) must be all in the same context.

There are some other libraries around to make bulk operations but they works only with SQL Server and I think that a great added value of EF is that is DBMS independent without significant efforts.

When I need to insert several records (less than 1.000.000) and I want to keep EF advantages I use the following methods. They generates a DML statement starting from an entity.

public int ExecuteInsertCommand(object entityObject)
{
    DbCommand command = GenerateInsertCommand(entityObject);
    ConnectionState oldConnectionState = command.Connection.State;
    try
    {
        if (oldConnectionState != ConnectionState.Open)
            command.Connection.Open();
        int result = command.ExecuteNonQuery();
        return result;

    }
    finally
    {
        if (oldConnectionState != ConnectionState.Open)
            command.Connection.Close();
    }
}

public DbCommand GenerateInsertCommand(object entityObject)
{

    ObjectContext objectContext = ((IObjectContextAdapter)Context).ObjectContext;
    var metadataWorkspace = ((EntityConnection)objectContext.Connection).GetMetadataWorkspace();

    IEnumerable<EntitySetMapping> entitySetMappingCollection = metadataWorkspace.GetItems<EntityContainerMapping>(DataSpace.CSSpace).Single().EntitySetMappings;
    IEnumerable<AssociationSetMapping> associationSetMappingCollection = metadataWorkspace.GetItems<EntityContainerMapping>(DataSpace.CSSpace).Single().AssociationSetMappings;

    var entitySetMappings = entitySetMappingCollection.First(o => o.EntityTypeMappings.Select(e => e.EntityType.Name).Contains(entityObject.GetType().Name));

    var entityTypeMapping = entitySetMappings.EntityTypeMappings[0];
    string tableName = entityTypeMapping.EntitySetMapping.EntitySet.Name;

    MappingFragment mappingFragment = entityTypeMapping.Fragments[0];

    string sqlColumns = string.Empty;
    string sqlValues = string.Empty;
    int paramCount = 0;

    DbCommand command = Context.Database.Connection.CreateCommand();

    foreach (PropertyMapping propertyMapping in mappingFragment.PropertyMappings)
    {
        if (((ScalarPropertyMapping)propertyMapping).Column.StoreGeneratedPattern != StoreGeneratedPattern.None)
            continue;

        string columnName = ((ScalarPropertyMapping)propertyMapping).Column.Name;
        object columnValue = entityObject.GetType().GetProperty(propertyMapping.Property.Name).GetValue(entityObject, null);
        string paramName = string.Format("@p{0}", paramCount);

        if (paramCount != 0)
        {
            sqlColumns += ",";
            sqlValues += ",";
        }

        sqlColumns += SqlQuote(columnName);
        sqlValues += paramName;

        DbParameter parameter = command.CreateParameter();
        parameter.Value = columnValue;
        parameter.ParameterName = paramName;
        command.Parameters.Add(parameter);

        paramCount++;
    }

    foreach (var navigationProperty in entityTypeMapping.EntityType.NavigationProperties)
    {
        PropertyInfo propertyInfo = entityObject.GetType().GetProperty(navigationProperty.Name);
        if (typeof(System.Collections.IEnumerable).IsAssignableFrom(propertyInfo.PropertyType))
            continue;

        AssociationSetMapping associationSetMapping = associationSetMappingCollection.First(a => a.AssociationSet.ElementType.FullName == navigationProperty.RelationshipType.FullName);

        EndPropertyMapping propertyMappings = associationSetMapping.AssociationTypeMapping.MappingFragment.PropertyMappings.Cast<EndPropertyMapping>().First(p => p.AssociationEnd.Name.EndsWith("_Target"));

        object relatedObject = propertyInfo.GetValue(entityObject, null);

        foreach (ScalarPropertyMapping propertyMapping in propertyMappings.PropertyMappings)
        {
            string columnName = propertyMapping.Column.Name;
            string paramName = string.Format("@p{0}", paramCount);
            object columnValue = relatedObject == null ?
                null :
                relatedObject.GetType().GetProperty(propertyMapping.Property.Name).GetValue(relatedObject, null);

            if (paramCount != 0)
            {
                sqlColumns += ",";
                sqlValues += ",";
            }

            sqlColumns += SqlQuote(columnName);
            sqlValues += string.Format("@p{0}", paramCount);

            DbParameter parameter = command.CreateParameter();
            parameter.Value = columnValue;
            parameter.ParameterName = paramName;
            command.Parameters.Add(parameter);

            paramCount++;
        }
    }

    string sql = string.Format("INSERT INTO {0} ({1}) VALUES ({2})", tableName, sqlColumns, sqlValues);
    command.CommandText = sql;

    foreach (DbParameter parameter in command.Parameters)
    {
        if (parameter.Value == null)
            parameter.Value = DBNull.Value;
    }

    return command;
}

public int ExecuteUpdateCommand(object entityObject)
{
    DbCommand command = GenerateUpdateCommand(entityObject);
    ConnectionState oldConnectionState = command.Connection.State;
    try
    {
        if (oldConnectionState != ConnectionState.Open)
            command.Connection.Open();
        int result = command.ExecuteNonQuery();
        return result;
    }
    finally
    {
        if (oldConnectionState != ConnectionState.Open)
            command.Connection.Close();
    }
}

public DbCommand GenerateUpdateCommand(object entityObject)
{

    ObjectContext objectContext = ((IObjectContextAdapter)Context).ObjectContext;
    var metadataWorkspace = ((EntityConnection)objectContext.Connection).GetMetadataWorkspace();

    IEnumerable<EntitySetMapping> entitySetMappingCollection = metadataWorkspace.GetItems<EntityContainerMapping>(DataSpace.CSSpace).Single().EntitySetMappings;
    IEnumerable<AssociationSetMapping> associationSetMappingCollection = metadataWorkspace.GetItems<EntityContainerMapping>(DataSpace.CSSpace).Single().AssociationSetMappings;

    string entityTypeName;
    if (!entityObject.GetType().Namespace.Contains("DynamicProxi"))
        entityTypeName = entityObject.GetType().Name;
    else
        entityTypeName = entityObject.GetType().BaseType.Name;
    var entitySetMappings = entitySetMappingCollection.First(o => o.EntityTypeMappings.Select(e => e.EntityType.Name).Contains(entityTypeName));

    var entityTypeMapping = entitySetMappings.EntityTypeMappings[0];
    string tableName = entityTypeMapping.EntitySetMapping.EntitySet.Name;

    MappingFragment mappingFragment = entityTypeMapping.Fragments[0];

    string sqlColumns = string.Empty;
    int paramCount = 0;

    DbCommand command = Context.Database.Connection.CreateCommand();

    foreach (PropertyMapping propertyMapping in mappingFragment.PropertyMappings)
    {
        if (((ScalarPropertyMapping)propertyMapping).Column.StoreGeneratedPattern != StoreGeneratedPattern.None)
            continue;

        string columnName = ((ScalarPropertyMapping)propertyMapping).Column.Name;

        if (entityTypeMapping.EntityType.KeyProperties.Select(_ => _.Name).Contains(columnName))
            continue;

        object columnValue = entityObject.GetType().GetProperty(propertyMapping.Property.Name).GetValue(entityObject, null);
        string paramName = string.Format("@p{0}", paramCount);

        if (paramCount != 0)
            sqlColumns += ",";

        sqlColumns += string.Format("{0} = {1}", SqlQuote(columnName), paramName);

        DbParameter parameter = command.CreateParameter();
        parameter.Value = columnValue ?? DBNull.Value;
        parameter.ParameterName = paramName;
        command.Parameters.Add(parameter);

        paramCount++;
    }

    foreach (var navigationProperty in entityTypeMapping.EntityType.NavigationProperties)
    {
        PropertyInfo propertyInfo = entityObject.GetType().GetProperty(navigationProperty.Name);
        if (typeof(System.Collections.IEnumerable).IsAssignableFrom(propertyInfo.PropertyType))
            continue;

        AssociationSetMapping associationSetMapping = associationSetMappingCollection.First(a => a.AssociationSet.ElementType.FullName == navigationProperty.RelationshipType.FullName);

        EndPropertyMapping propertyMappings = associationSetMapping.AssociationTypeMapping.MappingFragment.PropertyMappings.Cast<EndPropertyMapping>().First(p => p.AssociationEnd.Name.EndsWith("_Target"));

        object relatedObject = propertyInfo.GetValue(entityObject, null);


        foreach (ScalarPropertyMapping propertyMapping in propertyMappings.PropertyMappings)
        {
            string columnName = propertyMapping.Column.Name;
            string paramName = string.Format("@p{0}", paramCount);
            object columnValue = relatedObject == null ?
                null :
                relatedObject.GetType().GetProperty(propertyMapping.Property.Name).GetValue(relatedObject, null);

            if (paramCount != 0)
                sqlColumns += ",";

            sqlColumns += string.Format("{0} = {1}", SqlQuote(columnName), paramName);

            DbParameter parameter = command.CreateParameter();
            parameter.Value = columnValue ?? DBNull.Value;
            parameter.ParameterName = paramName;
            command.Parameters.Add(parameter);

            paramCount++;
        }
    }


    string sqlWhere = string.Empty;
    bool first = true;
    foreach (EdmProperty keyProperty in entityTypeMapping.EntityType.KeyProperties)
    {
        var propertyMapping = mappingFragment.PropertyMappings.First(p => p.Property.Name == keyProperty.Name);
        string columnName = ((ScalarPropertyMapping)propertyMapping).Column.Name;
        object columnValue = entityObject.GetType().GetProperty(propertyMapping.Property.Name).GetValue(entityObject, null);
        string paramName = string.Format("@p{0}", paramCount);

        if (first)
            first = false;
        else
            sqlWhere += " AND ";

        sqlWhere += string.Format("{0} = {1}", SqlQuote(columnName), paramName);

        DbParameter parameter = command.CreateParameter();
        parameter.Value = columnValue;
        parameter.ParameterName = paramName;
        command.Parameters.Add(parameter);

        paramCount++;

    }


    string sql = string.Format("UPDATE {0} SET {1} WHERE {2}", tableName, sqlColumns, sqlWhere);
    command.CommandText = sql;

    return command;
}
bubi
  • 6,414
  • 3
  • 28
  • 45
0

Add Method

This method is getting slower and slower after every iteration. In fact, this method doesn't get slower but the DetectChanges methods that get called inside the Add method.

So more record the ChangeTracker contains, slower the DetectChanges method become.

At around 100,000 entities, it can get more than 200ms to simply add a new entity when it was taking 0ms when the first entity was added.

Solution

There are several solutions to fix this issue such as:

  • USE AddRange over Add
  • SET AutoDetectChanges to false
  • SPLIT SaveChanges in multiple batches

In your case, probably re-creating a new context every time you loop can be the best idea since it looks you want to save on every iteration.

foreach (var user in users.ToList())
{
   var statisticsService = new Instance<IStatisticsService>();

   var statistic1 = GetStatistic(1); // this returns an object
   var statistic 2 = GetStatistic(2);

   statisticsService.Add(statistic1);
   statisticsService.Add(statistic2);

   statisticsService.Commit(); /* this is essentially a dbContext.SaveChanges(); */
}

Once you get rid of the poor performance due to the DetectChanges method, you still have a performance issue caused by the number of database round-trip performed by the SaveChanges methods.

If you need to save 10,000 statistics, then SaveChanges will make 10,000 database round-trip which is INSANELY slow.

Disclaimer: I'm the owner of the project Entity Framework Extensions

This library allows you to perform all bulk operations:

  • BulkSaveChanges
  • BulkInsert
  • BulkUpdate
  • BulkDelete
  • BulkMerge
  • BulkSynchronize

It works will all major provider such:

  • SQL Server
  • SQL Compact
  • Oracle
  • MySQL
  • SQLite
  • PostgreSQL

Example:

// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);

// Customize Primary Key
context.BulkMerge(customers, operation => {
   operation.ColumnPrimaryKeyExpression = 
        customer => customer.Code;
});
Jonathan Magnan
  • 10,874
  • 2
  • 38
  • 60