42

What is the best way to deal with batch updates using (Entity Framework) EF5? I have 2 particular cases I'm interested in:

  1. Updating a field (e.g. UpdateDate) for a list (List) of between 100 and 100.000 Id's, which the primary key. Calling each update separately seem to be to much overhead and takes a long time.

  2. Inserting many, also between the 100 and 100.000, of the same objects (e.g. Users) in a single go.

Any good advice?

Frank
  • 3,959
  • 4
  • 19
  • 24

7 Answers7

37
  1. There are two open source projects allowing this: EntityFramework.Extended and Entity Framework Extensions. You can also check discussion about bulk updates on EF's codeplex site.
  2. Inserting 100k records through EF is in the first place wrong application architecture. You should choose different lightweight technology for data imports. Even EF's internal operation with such big record set will cost you a lot of processing time. There is currently no solution for batch inserts for EF but there is broad discussion about this feature on EF's code plex site.
Ladislav Mrnka
  • 360,892
  • 59
  • 660
  • 670
22

I see the following options:

1 . The simplest way - create your SQL request by hands and execute through ObjectContext.ExecuteStoreCommand

context.ExecuteStoreCommand("UPDATE TABLE SET FIELD1 = {0} WHERE FIELD2 = {1}", value1, value2);

2 . Use EntityFramework.Extended

context.Tasks.Update(
    t => t.StatusId == 1, 
    t => new Task {StatusId = 2});

3 . Make your own extension for EF. There is an article Bulk Delete where this goal was achieved by inheriting ObjectContext class. It's worth to take a look. Bulk insert/update can be implemented in the same way.

Alex Klaus
  • 8,168
  • 8
  • 71
  • 87
3

You may not want to hear it, but your best option is to not use EF for bulk operations. For updating a field across a table of records, use an Update statement in the database (possibly called through a stored proc mapped to an EF Function). You can also use the Context.ExecuteStoreQuery method to issue an Update statement to the database.

For massive inserts, your best bet is to use Bulk Copy or SSIS. EF will require a separate hit to the database for each row being inserted.

Jim Wooley
  • 10,169
  • 1
  • 25
  • 43
2

Bulk inserts should be done using the SqlBulkCopy class. Please see pre-existing StackOverflow Q&A on integrating the two: SqlBulkCopy and Entity Framework

SqlBulkCopy is a lot more user-friendly than bcp (Bulk Copy command-line utility) or even OPEN ROWSET.

Community
  • 1
  • 1
John Zabroski
  • 2,212
  • 2
  • 28
  • 54
1

Here's what I've done successfully:

private void BulkUpdate()
{
    var oc = ((IObjectContextAdapter)_dbContext).ObjectContext;
    var updateQuery = myIQueryable.ToString(); // This MUST be above the call to get the parameters.
    var updateParams = GetSqlParametersForIQueryable(updateQuery).ToArray();
    var updateSql = $@"UPDATE dbo.myTable
                       SET col1 = x.alias2
                       FROM dbo.myTable
                       JOIN ({updateQuery}) x(alias1, alias2) ON x.alias1 = dbo.myTable.Id";
    oc.ExecuteStoreCommand(updateSql, updateParams);
}

private void BulkInsert()
{
    var oc = ((IObjectContextAdapter)_dbContext).ObjectContext;
    var insertQuery = myIQueryable.ToString(); // This MUST be above the call to get the parameters.
    var insertParams = GetSqlParametersForIQueryable(insertQuery).ToArray();
    var insertSql = $@"INSERT INTO dbo.myTable (col1, col2)
                       SELECT x.alias1, x.alias2
                       FROM ({insertQuery}) x(alias1, alias2)";
    oc.ExecuteStoreCommand(insertSql, insertParams.ToArray());
}    

private static IEnumerable<SqlParameter> GetSqlParametersForIQueryable<T>(IQueryable<T> queryable)
{
    var objectQuery = GetObjectQueryFromIQueryable(queryable);
    return objectQuery.Parameters.Select(x => new SqlParameter(x.Name, x.Value));
}

private static ObjectQuery<T> GetObjectQueryFromIQueryable<T>(IQueryable<T> queryable)
{
    var dbQuery = (DbQuery<T>)queryable;
    var iqProp = dbQuery.GetType().GetProperty("InternalQuery", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);
    var iq = iqProp.GetValue(dbQuery, null);
    var oqProp = iq.GetType().GetProperty("ObjectQuery", BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);
    return (ObjectQuery<T>)oqProp.GetValue(iq, null);
}
adam0101
  • 29,096
  • 21
  • 96
  • 174
1
    public static bool BulkDelete(string tableName, string columnName, List<object> val)
    {
        bool ret = true;

        var max = 2000;
        var pages = Math.Ceiling((double)val.Count / max);
        for (int i = 0; i < pages; i++)
        {
            var count = max;
            if (i == pages - 1) { count = val.Count % max; }

            var args = val.GetRange(i * max, count);
            var cond = string.Join("", args.Select((t, index) => $",@p{index}")).Substring(1);
            var sql = $"DELETE FROM {tableName} WHERE {columnName} IN ({cond}) ";

            ret &= Db.ExecuteSqlCommand(sql, args.ToArray()) > 0;
        }

        return ret;
    }
Jack CQ
  • 21
  • 2
0

I agree with the accepted answer that ef is probably the wrong technology for bulk inserts. However, I think it's worth having a look at EntityFramework.BulkInsert.

Lukas Winzenried
  • 1,919
  • 1
  • 14
  • 22