0

The project is a .net core 7.0 web api using entity framework core and DI.

I have two instances of DbContext, one instance checks to see if the entity or record is created, if it isn't it calls a method in a class to create that record and pass the Id. Then it searches for the record again using the Id to update it.

I noticed that if the dbContext instance that creates the record wasn't disposed correctly, when I update properties in the second instance for the same record, it doesn't actually update it. It seems to be updating the other instance again (I think).

Due to dbContext having concurrency constraints, I specifically created two separate instances, to have this separation of concerns.

First class with dbContext that is injected via Dependency Injection

if (study is null)
            {
                var logMessage = $"Unable to find {StudyInstanceUid} in the database. Creating a new study and marking it as an exception.";
                _logger.LogWarning(logMessage);

                var studyId = await _cs.CreateStudy(myStudy);
                study = await (from s in _cdb.Studies where s.Id == studyId select s).FirstAsync();
            }

Service class that has the create methods

public async Task<int> CreateStudy(studyNotification record)
        {
            var _optionsBuilder = new DbContextOptionsBuilder<MyDbContext>();
            _optionsBuilder.UseSqlServer(_config.GetConnectionString("MyDb"));
            using var _cdb = new MyDbContext(_optionsBuilder.Options);

            var study = new Study()
            {
                ...
            };


            _cdb.Studies.Add(study);
            await _cdb.SaveChangesAsync();
            return study.Id;
        }

If I modify the code above to:

            await _cdb.SaveChangesAsync();
            int id = study.Id;
            return id;

It works as expected. While I do not understand the inner workings of entity framework core, I would have thought that the two different instances would not interfere with each other. I would like to understand why this issue occurs?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
sridawg
  • 130
  • 2
  • 9
  • One issue i see right away is that your service method is creating an instance of your dbcontext. dbcontext is a unit of work. In a webapi the unit of work is the request. Add a dependency to the context to your controller and pass the instance to your service(s) so that you are using the context. that way everything under the covers is using the same unit of work. Use the DI helper method AddDbContext to add a scoped dbcontext so that your controllers can use them. this will create a dbcontext that is scoped to the request and disposed of properly when the response is returned. – Fran Apr 11 '23 at 19:59
  • That is definitely an approach that I have used before and no doubt would resolve this issue, but the service class is part of a library. For that reason alone I think it should be able to perform data operations independently. – sridawg Apr 11 '23 at 22:16

1 Answers1

1

There should be no difference between:

await _cdb.SaveChangesAsync();
return study.Id;

and

await _cdb.SaveChangesAsync();
int id = study.Id;
return id;

Differences you will see between entities loaded by different DbContexts and the timing of events relative to one another will often come down to tracked instances.

If a DbContext instance happens to have loaded an instance of a Study (say Id = 5) either directly, or indirectly as the result of another data read, by default that entity instance will be tracked (cached) by the DbContext instance. If another new DbContext instance goes to load that Study ID #5, it will load that record from the database. Say that 2nd DbContext modifies and saves that record to the database. If the first DbContext tries to read it again using something like:

var study = context.Studies.Single(x => x.StudyId == studyId);

You might expect that you'll get the updated Study record from the database... With a profiler attached you'll even see EF execute a query against the database, but what you will get back is the cached, tracked instance that DbContext already has, taken before the 2nd DbContext made changes.

The simplest way to ensure that you get the current database state when fetching an entity is to tell EF explicitly not to track the entity. This means EF won't add it to the tracking cache, and more importantly, it won't read it from the tracking cache either.

var study = context.Studies.AsNoTracking().Single(x => x.StudyId == studyId);

Whenever working with data that can be concurrently modified, either between threads in the application or external processes / users, it is important to either use fresh DbContext instances for operations, or use non-tracking queries to ensure that each query takes the data state each time as the source of truth. This can be done with AsNoTracking() like above, or by basing read operations on Projections using Select or ProjectTo. (Automapper) Tracking queries should be reserved solely for situations where you want to update data, and completed as quickly as possibly.

Edit: If you do want to update the entity and use change tracking, but ensure that the entity in that case is up to date, then you can check the local cache and reload the entity if found, otherwise ensure that the retrieved entity is coming from the database. If concurrent editing is a significant factor in your application(s) then I would also suggest implementing a concurrency marker in the relevant tables such as a Timestamp or RowVersion to inspect and help guard against stale updates.

// Check the local cache, if found, issue a reload to ensure it is up to date:
var study = _cdb.Studies.Local.FirstOrDefault(x => x.StudyId == studyId);
if (study != null)
    _cdb.Entry(study).Reload();
else
    study = _cdb.Studies.Single(x => x.StudyId == studyId);

The main issue with this approach is that it will work just fine so long as you only want the Study entity and not rely on related data being present since we don't know that the tracked Study instance has all or any related data loaded. For instance if a Study has a collection of references and we'd normally do something like:

else
    study = _cdb.Studies.Include(x => x.References).Single(x => x.StudyId == studyId);

... to ensure the references are loaded, the problem is that if we do find a local study instance cached we cannot safely assume that the code that might have loaded it also eager loaded the references. There may be References in the collection, but only ones that the DbContext also happened to be tracking. In this case it would be safer to detach any tracked entity and re-load it:

var study = _cdb.Studies.Local.FirstOrDefault(x => x.StudyId == studyId);
if (study != null)
    _cdb.Entry(study).State = EntityState.Detached;

study = _cdb.Studies
   .Include(x => x.References)
   .Single(x => x.StudyId == studyId);

This checks the cache, if found, removes the study from the cache, and the next statement will load the study and related data from the database.

Steve Py
  • 26,149
  • 3
  • 25
  • 43
  • This definitely puts me in the right direction, however AsNoTracking() does not allow me to actually update the entity after retrieving it which is a requirement. I searched for options on forcing a read from the database, and found: https://stackoverflow.com/questions/22177491/how-to-force-entity-framework-to-always-get-updated-data-from-the-database. Consequently it mentions the use of Reload, and others have also mentioned the need to dispose of instances. I assume when I have a hanging reference to an object it doesn't dispose consistently. – sridawg Apr 11 '23 at 11:03
  • Yes, Reload can be used when faced with a possibly stale reference to ensure it has the current data state. Hanging references would only be an issue if they were associated and tracked by the DbContext in question. I will update the answer to include checking the local cache & Reload to consider where you do want change tracking. – Steve Py Apr 11 '23 at 21:05