The same topic was discussed here 8 months ago: "How do I speed up DbSet.Add()?". No solution was proposed other than using SqlBulkCopy, which is not acceptable for us. I'm raising it again in the hope that there are new ideas or other workarounds by now. At the very least, I'm curious why this operation takes so long.

Well, the problem is: I have to save 30K entities to the database (EF 4.1, POCO). The entity type is quite simple: an integer Id plus 4 other integer properties, with no relations to other types. There are 2 cases:

  • All of them are new records. Calling Cntx.Entities.Add(entity) one by one for every entity takes 90 seconds with Cntx.Configuration.AutoDetectChangesEnabled = false (with true it runs forever). SaveChanges then takes just a second. The other approach, attaching each entity to the context like this, takes the same 90 seconds:

    Cntx.Entities.Attach(entity);
    Cntx.Entry(entity).State = EntityState.Added;
    
  • All of them are existing records with some changes. In this case it takes just a few milliseconds to attach each one to the existing data context like this:

    Cntx.Entities.Attach(entity);
    Cntx.Entry(entity).State = EntityState.Modified;
    

    See the difference?

What happens behind the scenes in the Add method that makes it so incredibly slow?

YMC

1 Answer

I ran some performance tests, got interesting results, and found the culprit. I have not seen this mentioned in any EF source I've read.

It turns out to be Equals overridden in a base class. The base class is supposed to contain the Id property shared by all concrete entity types. This approach is recommended by many EF books and is pretty well known. You can find it here, for example: How to best implement Equals for custom types?

More exactly, the slowdown seemed to come from the object-to-concrete-type conversion in that override (I call it unboxing here, although strictly speaking `as` on a reference type is a cast, not unboxing). Once I commented out this line of code, the run took 3 seconds as opposed to 90 seconds before!

public override bool Equals(object obj)
{
    // This line of code made the code so slow
    var entityBase = obj as EntityBase;
    ...
}

Once I found it, I started thinking about alternatives to this Equals. My first idea was to implement IEquatable on EntityBase, but that implementation was never called. So in the end I decided to implement IEquatable on each concrete entity class in my model. I have only a few of them, so it's a minor change for me. You can put the whole equality logic (usually a comparison of the two objects' Ids) into an extension method shared between the concrete entity classes and call it like this: Equal((EntityBase)ConcreteEntityClass). Most interestingly, this IEquatable speeds up EntitySet.Add 6 times!
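
The answer does not show the actual implementation, so the following is only a minimal sketch of the idea described above: the comparison lives in one shared extension method, and each concrete entity implements IEquatable<T> by delegating to it. The names `SameIdentity`, `EntityEquality`, and `Customer` are hypothetical, and treating entities with Id == 0 as equal only by reference is an assumption, not something the answer states. The sketch illustrates the pattern only; it does not reproduce the timings, which may depend on the EF version and model.

```csharp
using System;
using System.Collections.Generic;

// Base class holding only the shared Id, with no Equals(object) override.
public abstract class EntityBase
{
    public int Id { get; set; }
}

// Hypothetical helper: the shared comparison logic in one extension method.
public static class EntityEquality
{
    public static bool SameIdentity(this EntityBase left, EntityBase right)
    {
        if (ReferenceEquals(left, right)) return true;
        if (ReferenceEquals(left, null) || ReferenceEquals(right, null)) return false;
        if (left.GetType() != right.GetType()) return false;
        // Assumption: unsaved entities (Id == 0) are equal only by reference.
        return left.Id != 0 && left.Id == right.Id;
    }
}

// Hypothetical concrete entity implementing the typed interface.
public class Customer : EntityBase, IEquatable<Customer>
{
    public int Score { get; set; }

    public bool Equals(Customer other)
    {
        return this.SameIdentity(other);
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Customer { Id = 7 };
        var b = new Customer { Id = 7 };
        var list = new List<Customer> { a };

        Console.WriteLine(a.Equals(b));          // True: typed overload is chosen
        Console.WriteLine(list.Contains(b));     // True: the default comparer uses IEquatable<Customer>
        Console.WriteLine(a.Equals((object)b));  // False: object.Equals is not overridden here
    }
}
```

Note that hash-based collections such as HashSet<T> and Dictionary<TKey, TValue> also call GetHashCode, which this sketch deliberately leaves untouched, so two distinct instances with the same Id would still land in different buckets.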

So I have no more performance issues; the same code now runs in less than a second. That's a 180-fold performance gain! Amazing!

Conclusion:

  1. The fastest way to run EntitySet.Add is to have IEquatable on the specific entity (0.5 sec).
  2. Without IEquatable (and with the slow line in Equals commented out), it runs in 3 sec.
  3. Having the Equals(object obj) override, which most sources recommend, makes it run in 90 sec.

YMC
  • @YMC: Could you elaborate on how to implement `IEquatable`? I'm confused about whether to override object.Equals() for a mutable type (because you're then also supposed to override object.GetHashCode(), but if the hash code changes while the object is in a dictionary, it becomes orphaned). After the object is persisted I can use the primary key from the DB, but before it's persisted, all new objects have a key of 0. I posed this as a separate question but did not receive a good response yet http://stackoverflow.com/questions/9782235/implement-iequatable-for-poco – Eric J. Mar 20 '12 at 18:02
  • @YMC I'm also hoping you elaborate on the `IEquatable` implementation. I'm running into the same problem. – vlad Jun 21 '12 at 21:24
  • vlad and Eric, I don't have access to the code right now, but I think you can use at least 2 approaches: 1) Use a Guid to generate primary key values, in which case the GetHashCode() implementation might be as simple as `return Id.GetHashCode()`. 2) In the case of an integer Id, it might look like this: `IsNew ? base.GetHashCode() : Id.GetHashCode()`, where IsNew returns true if Id==0. Correct me if there is something wrong with the approach – YMC Jun 22 '12 at 01:27
  • woah... weird stuff happening here... my code went from 10seconds to 0.1... cheesus – Mark Segal Dec 05 '12 at 18:54
  • I tried this (implemented `IEquatable`) and it had no effect on the speed of the code. Also, you talk about *unboxing* but there is no unboxing in the code you quoted. In your list of conclusions, option #2 is not clear: it only states what it is *missing* but not what it *is*. Perhaps some clear instructions on what to do would be useful. – Timwi May 31 '13 at 14:14
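
The integer-Id variant from YMC's comment above (`IsNew ? base.GetHashCode() : Id.GetHashCode()`) can be sketched as follows. This is only an illustration under assumptions: the `Order` class is hypothetical, and `IsNew` is written as a property that returns true when Id == 0, as the comment describes. The demo also shows the concern Eric J. raised: the hash code changes once the Id is assigned, so an entity added to a hash-based collection while still new will normally not be found there afterwards.

```csharp
using System;
using System.Collections.Generic;

public abstract class EntityBase
{
    public int Id { get; set; }

    // As described in the comment: an entity is new until it gets a non-zero Id.
    public bool IsNew
    {
        get { return Id == 0; }
    }

    public override int GetHashCode()
    {
        // New entities fall back to the default object hash;
        // persisted entities hash by their Id.
        return IsNew ? base.GetHashCode() : Id.GetHashCode();
    }
}

// Hypothetical concrete entity used only for the demonstration.
public class Order : EntityBase
{
}

public static class Program
{
    public static void Main()
    {
        var order = new Order();
        var set = new HashSet<Order> { order };   // stored under the "new" hash

        order.Id = 42;                            // simulates the key assigned on save

        // The hash now comes from the Id.
        Console.WriteLine(order.GetHashCode() == 42.GetHashCode());

        // The lookup uses the new hash, so the entry stored under the old one
        // is normally not found.
        Console.WriteLine(set.Contains(order));
    }
}
```

One way to sidestep the problem is to avoid putting entities into dictionaries or hash sets until they have been saved; the Guid approach from the same comment avoids it because the key is known from the moment the object is created.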