
I am working with detached entities and sending them through a TPL Dataflow pipeline. The point of concern is the first TransformBlock.

The problem (described below) occurs only when I increase the parallelism of the TransformBlock to greater than 1. I have not observed it when testing with a max degree of parallelism of 1.

The TransformBlock's input and output object is a document. Its purpose is to make some API calls, update the document object, save it to the database, and return it for further processing in the pipeline.

DocumentCreationTransformBlock = new TransformBlock<Document, Document>(async document =>
{
    await _createDocumentAction.DoCreateActionAsync(document);
    return document;
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 20 });
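To try the parallel setup in isolation, here is a minimal, self-contained sketch of a TransformBlock running with a configurable degree of parallelism. It is not the real pipeline: `PipelineSketch`, `RunAsync`, the `int` payload standing in for `Document`, and the `Task.Delay` standing in for the API call are all hypothetical names and placeholders chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class PipelineSketch
{
    // Hypothetical helper: pushes every input through one TransformBlock and
    // collects the outputs. An int stands in for the Document entity.
    public static async Task<List<int>> RunAsync(IEnumerable<int> inputs, int maxDegreeOfParallelism)
    {
        var options = new ExecutionDataflowBlockOptions
        {
            // Up to this many delegate invocations may run concurrently.
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        };

        var block = new TransformBlock<int, int>(async value =>
        {
            await Task.Delay(10);   // placeholder for the REST call and the save
            return value * 2;       // placeholder for "update and return the document"
        }, options);

        foreach (int input in inputs)
        {
            block.Post(input);      // the block is unbounded by default, so Post accepts the item
        }
        block.Complete();           // signal that no more input will arrive

        var results = new List<int>();
        while (await block.OutputAvailableAsync())
        {
            results.Add(await block.ReceiveAsync());
        }

        await block.Completion;     // surfaces any exception thrown inside the delegate
        return results;
    }
}
```

Calling `await PipelineSketch.RunAsync(new[] { 1, 2, 3 }, 20)` runs the delegate for several items at once. Because each invocation can touch its input object concurrently with the others, any state reachable from more than one input is shared between threads.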

The method DoCreateActionAsync instantiates a new DbContext and manually controls attaching entities and marking the modified properties.

// pass in a document object which is an Entity object but is 'disconnected' (no tracked changes)
public async Task<bool> DoCreateActionAsync(Document document)
{
    // Make some changes to the detached 'document' object properties
    document.Attempt++;

    // make the REST API call 
    await CreateAsync(document);

    document.Result = 1;
    document.NewId = "1111100000";

    using (MyEntities context = new MyEntities())
    {
        // 1. Attach the document to this context. It attaches with the EntityState = Unchanged.
        context.Documents.Attach(document);

        // 2. Manually set properties we know we have edited to IsModified = true
        context.Entry(document).Property(u => u.Attempt).IsModified = true;
        context.Entry(document).Property(u => u.Result).IsModified = true;
        context.Entry(document).Property(u => u.NewId).IsModified = true;

        // 3. Update version 1 in the collection navigation property DocumentVersions.
        // DocumentVersions have previously been eagerly loaded, so no database query occurs here.
        var version1 = document.DocumentVersions.OrderBy(v => v.DocumentVersion).FirstOrDefault();
        if (version1 != null)
        {
            context.Entry(version1).Property(u => u.VersionCreated).IsModified = true;
        }            

        context.SaveChanges();
    }

    return true; // assumed: report success once the save completes
}

From the above code I expect the call to SaveChanges() to generate:

  • 1 SQL update statement for the Document object

  • 1 SQL update statement for the DocumentVersion object

I logged the modified entities in the DbContext.ChangeTracker.Entries() list, along with each entity's modified properties.

In this example I was updating Document id = 8 (and its associated DocumentVersion id = 12):

#EntityType - #EntityState - #EntityKey
Document - Modified - Id:3
    Result
    NewId
Document - Modified - Id:8
    Attempt
    Result
    NewId
DocumentVersion - Modified - Id:6
    VersionCreated
DocumentVersion - Modified - Id:12
    VersionCreated

The entity I want to update was always being updated, but randomly (different results on every test run), additional documents were being updated.

I have re-run the code and every time different documents are affected in this way. Sometimes up to 3 documents are in the Change Tracking list. The only consistent thing is that at a minimum, the entities I want to be tracked are being tracked. The 'extra' entities being tracked here are also being updated on different threads as they are passed to the DoCreateActionAsync method.

How are the extra entities attaching themselves to this DbContext?

SeanOB
  • Check this out: [Is DbContext thread safe?](https://stackoverflow.com/questions/6126616/is-dbcontext-thread-safe) – Theodor Zoulias Sep 23 '19 at 13:01
  • @TheodorZoulias the answer to that question is "Simply create a new instance of DbContext in you thread." As you can see, I am not sharing instances of my DbContext. – SeanOB Sep 23 '19 at 13:08
  • You are right. So the problem must be something else. Can you post the source code of the method `CreateAsync`? – Theodor Zoulias Sep 23 '19 at 13:24
  • Are you feeding some documents to the pipeline twice? (I see you update document.Attempt, which is why I ask.) Otherwise, try detaching them once you are done with them. – Peter Bons Sep 23 '19 at 13:55
  • @PeterBons It doesn't appear to be passing any documents twice to the pipeline. I confirm this by logging the doc id as they are entered into the pipeline. I tried testing detaching the entities at the end of this DbContext using statement. However, it sets all navigation properties (such as DocumentVersions) to null. I still need that data in memory for the next pipeline blocks. – SeanOB Sep 23 '19 at 14:29
  • @TheodorZoulias The full code for that method is quite long. There is no reference to DbContext in it. It basically makes the REST API call and then sets document.Result and document.NewId (which I pulled out of that method and showed in my example for clarity). So you can ignore the await CreateAsync line in my sample above. – SeanOB Sep 23 '19 at 14:34
  • Adding `context.Configuration.AutoDetectChangesEnabled = false;` as the first line inside the using statement appears to resolve the issue. I still don't understand why my context was automatically picking up changes to unrelated entities (entities that were being processed elsewhere in the pipeline). It's as if it is trying to consolidate changes from different contexts into fewer transactions. – SeanOB Sep 24 '19 at 05:17

0 Answers