I am working with detached entities and sending them through a TPL Dataflow pipeline. The point of concern is the first TransformBlock.
The problem described below occurs only when I increase the TransformBlock's MaxDegreeOfParallelism above 1. I have not observed it when testing with MaxDegreeOfParallelism = 1.
The TransformBlock's input and output type is Document. Its purpose is to make some API calls, update the Document object, save it to the database and return it for further processing in the pipeline.
// Up to 20 documents are processed concurrently
DocumentCreationTransformBlock = new TransformBlock<Document, Document>(async document =>
{
    await _createDocumentAction.DoCreateActionAsync(document);
    return document;
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 20 });
The method DoCreateActionAsync instantiates a new DbContext and manually controls attaching entities and marking the modified properties.
// Pass in a document object which is an entity object but is 'disconnected' (no tracked changes)
public async Task<bool> DoCreateActionAsync(Document document)
{
    // Make some changes to the detached 'document' object properties
    document.Attempt++;
    // Make the REST API call
    await CreateAsync(document);
    document.Result = 1;
    document.NewId = "1111100000";
    using (MyEntities context = new MyEntities())
    {
        // 1. Attach the document to this context. It attaches with EntityState = Unchanged.
        context.Documents.Attach(document);
        // 2. Manually set the properties we know we have edited to IsModified = true
        context.Entry(document).Property(u => u.Attempt).IsModified = true;
        context.Entry(document).Property(u => u.Result).IsModified = true;
        context.Entry(document).Property(u => u.NewId).IsModified = true;
        // 3. Update version 1 in the collection navigation property DocumentVersions.
        // DocumentVersions have previously been eagerly loaded, so no database query occurs here.
        var version1 = document.DocumentVersions.OrderBy(v => v.DocumentVersion).FirstOrDefault();
        if (version1 != null)
        {
            context.Entry(version1).Property(u => u.VersionCreated).IsModified = true;
        }
        context.SaveChanges();
    }
    return true;
}
From the above code I expect the call to SaveChanges() to generate:
1 SQL update statement for the Document object
1 SQL update statement for the DocumentVersion object
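One way to check this expectation against what is actually sent to the database is EF's built-in command logging. The following is only a sketch, assuming MyEntities is an EF6 (6.0 or later) DbContext, where Database.Log accepts an Action<string>. The helper name SqlLogging.Enable is hypothetical and not part of my code.

```csharp
using System;
using System.Data.Entity; // Assumption: EntityFramework 6.x package is referenced

public static class SqlLogging
{
    // Hypothetical helper: forwards the text of every command that EF
    // executes through this context (including the UPDATEs issued by
    // SaveChanges) to the supplied sink.
    public static void Enable(DbContext context, Action<string> sink)
    {
        context.Database.Log = sink;
    }
}
```

Calling SqlLogging.Enable(context, Console.Write) just before SaveChanges() would print each UPDATE statement, so the number of statements can be compared with the two expected above.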
I logged the modified entities from DbContext.ChangeTracker.Entries(), along with each entity's modified properties.
In this example I was updating Document Id = 8 (and its associated DocumentVersion Id = 12):
#EntityType - #EntityState - #EntityKey
Document - Modified - Id:3
    Result
    NewId
Document - Modified - Id:8
    Attempt
    Result
    NewId
DocumentVersion - Modified - Id:6
    VersionCreated
DocumentVersion - Modified - Id:12
    VersionCreated
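For reference, a log of roughly this shape can be produced with something like the sketch below. It assumes the EF6 DbContext API; the class and method names are hypothetical, and the entity key lookup is left out for brevity, so it prints only the type, state and modified property names.

```csharp
using System;
using System.Data.Entity; // Assumption: EF6 (DbContext, EntityState)
using System.Linq;

public static class ChangeTrackerLogger
{
    // Hypothetical helper: call inside the using block, just before SaveChanges().
    public static void LogModifiedEntries(DbContext context)
    {
        // Entries() runs DetectChanges first when AutoDetectChangesEnabled is on,
        // so edits made to tracked objects after Attach() are reported here too.
        var modified = context.ChangeTracker.Entries()
                              .Where(e => e.State == EntityState.Modified);

        foreach (var entry in modified)
        {
            Console.WriteLine($"{entry.Entity.GetType().Name} - {entry.State}");

            // List only the properties currently flagged as modified.
            foreach (var name in entry.CurrentValues.PropertyNames)
            {
                if (entry.Property(name).IsModified)
                {
                    Console.WriteLine($"    {name}");
                }
            }
        }
    }
}
```

Because the helper enumerates every tracked entry rather than only the document passed in, it also reports anything else that was attached to the same context.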
The entity I wanted to update was always updated, but additional documents were also updated, seemingly at random (the results differed on every test run).
I have re-run the code and different documents are affected each time. Sometimes up to 3 documents are in the change tracker's list. The only consistent thing is that, at a minimum, the entities I want tracked are tracked. The 'extra' entities tracked here are themselves being updated on other threads, where they have been passed to the DoCreateActionAsync method.
How are the extra entities attaching themselves to this DbContext?