1

I am trying to use Entity Framework to perform a merge operation. I have tried using the naive approach of retrieving the records and then doing the logic in the code, but it is very slow.

I have also looked into using the Bulk Insert library as discussed here, but it only supports bulk inserts.

So far my work-around has been to create a stored procedure, define a Table Value Type for my table, dynamically generate the input datatable using reflection, and then execute the stored procedure.

What this means, of course, is that every time my table definition changes, I now have to go in and change the stored procedure code and the table value type as well.

Is there any better way to do this? If I could dynamically generate column definitions from Entity Framework that would already help a lot.

EDIT: I am referring to T-SQL Merge: http://msdn.microsoft.com/en-us/library/bb510625.aspx

Community
  • 1
  • 1
user472875
  • 3,115
  • 4
  • 37
  • 68

2 Answers2

1

UPDATE

Sorry I missed that the question was tagged EF4. This will only work with new version where EF changed their mapping API. I leave it here because the same thing can be done with old api's too, I can find the code for doing that after work

Disclaimer: I'm the author of EFUtilities

EFUtilities https://github.com/MikaelEliasson/EntityFramework.Utilities supports Bulk update now. Out of the box it doesn't let you do the merge part. If you have a way of knowing which entities to insert and which to update you could issue one insert https://github.com/MikaelEliasson/EntityFramework.Utilities#batch-insert-entities and one update https://github.com/MikaelEliasson/EntityFramework.Utilities#batch-update-entities

If you really need the merge It can help you with building the table. You can get access to the metadata I'm using internally by this code.

using (var db = new Context())
{
    var mapping = EfMappingFactory.GetMappingsForContext(db);
    var commentMapping = mapping.TypeMappings[typeof(Comment)];
}

The typeMapping contain all the information I use for dynamically building my temp tables for the bulk update. For some reason I made that code private but here is the method that generates the sql for the table. All info is available on the typeMapping.

    private string BuildCreateTableCommand(string schema, string tempTableName, IEnumerable<ColumnMapping> properties)
    {
        var pkConstraint = string.Join(", ", properties.Where(p => p.IsPrimaryKey).Select(c => "[" + c.NameInDatabase + "]"));
        var columns = properties.Select(c => "[" + c.NameInDatabase + "] " + c.DataType);
        var pk = properties.Where(p => p.IsPrimaryKey).Any() ? string.Format(", PRIMARY KEY ({0})", pkConstraint) : "";

        var str = string.Format("CREATE TABLE {0}.[{1}]({2} {3})", schema, tempTableName, string.Join(", ", columns), pk);

        return str;
    }
Mikael Eliasson
  • 5,157
  • 23
  • 27
0

The best solution probably depends on what exactly you mean by "a merge operation".

If what you mean is that you want to insert entities from the set (table, csv-file or whatever) A which aren't already present in set B, you could

  1. define suitable GetHashCode and Equals methods for your entities
  2. read all existing records in B into memory in one go (assuming they fit in memory), putting them in a HashSet h
  3. read all entities from A, calculate A\h, which should be very fast, and then insert A\h into B, possibly using the bulk insert library.

The key here is to avoid having to do database lookups for each element in A.

Rune
  • 8,340
  • 3
  • 34
  • 47
  • Merge operation: http://msdn.microsoft.com/en-us/library/bb510625.aspx. The bottom line is that if I need to, say, update the timestamp of 10000 rows, and insert 50 more, it takes a very long time. – user472875 Jul 31 '14 at 19:40
  • @user472875 if that is what you want, you may get better answer by tagging your question with tags like 't-sql' and 'entity framework'. – Rune Jul 31 '14 at 19:43
  • Edited the question to clarify intent. Thanks for your input. – user472875 Jul 31 '14 at 19:47