Is there a strategy or design pattern for making large-scale graph clone operations in Entity Framework/EF Core?

Question

Essentially, we have a database with a recurring template pattern and instances of this template. Templates live indefinitely, while the instances are bound in time. One group of users works only with templates, and another group works only with "answer" entities connected to the instances. When a change is made to a template, the instances that are currently active automatically receive the change (including cloning related entities or bringing existing clones into sync), while older instances are left alone "as you left them". This is an absolute requirement so that history is not changed retroactively: when you go back to 2013, you want to see the data that was current as of the last change in 2013, not anything newer. Thus the cloning.

This all sounds good, except that making the clone involves cloning an involved graph of entities, sometimes including many-to-many relationships. Making sure that the information of the just-updated version of the template is used involves passing around that specific as-yet-unsaved entity object or saving at every step, forgetting all objects and making a new context every time. This code is hard to write, harder to get right and a nightmare to maintain.

I have desperately been looking for suitable literature about this and have been unable to even find something written up about the database modelling pattern (or for that matter better alternatives), never mind what to do in EF to work as efficiently as possible. Am I missing something, or is this just a case of it being a problem with inherent complexity?

To me this is "too broad". You should provide an example that demonstrates the main problems you are trying to tackle. Cloning in EF can be very simple (simply `Add()` an object graph root), but I can't piece together from your description how this would fit in. – Gert Arnold, Aug 12 '16 at 14:10

score 0 · Answer 1 · answered Aug 12 '16 at 13:23


There is nothing built in to help with this specific scenario. I'd consider a solution based on reflection and on the entity framework metadata model to automate a lot of this. That makes it easier to get right as well.

Cloning a graph of objects should be automatable and has little inherent complexity. But if you want to clone only specific parts, I can see complexity creeping in easily. That is likely to be inherent complexity. On the other hand, if you find yourself writing the same cloning code and copy loops all over the place, that's a missed abstraction and therefore artificial complexity.
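As a rough illustration of what "automatable, with some relations excluded" could look like, here is a minimal sketch. It uses plain reflection only, as a stand-in for walking the EF metadata model, and it is not tied to any EF API. The entity types (`Law`, `Paragraph`, `LawCategory`), the `GraphCloner` helper, and the convention that a property named `Id` is the key are all assumptions made for the example. It also assumes collections are `List<T>`-like types with a parameterless constructor, and it requires .NET 5 or later for `ReferenceEqualityComparer`.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical entities, for illustration only.
public class LawCategory
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Paragraph
{
    public int Id { get; set; }
    public string Text { get; set; }
    public Law Law { get; set; }
}

public class Law
{
    public int Id { get; set; }
    public string Title { get; set; }
    public List<Paragraph> Paragraphs { get; set; } = new List<Paragraph>();
    public List<LawCategory> Categories { get; set; } = new List<LawCategory>();
}

public static class GraphCloner
{
    // 'share' returns true for navigation properties whose targets must be
    // referenced by the clone as-is instead of being cloned themselves.
    public static T Clone<T>(T root, Func<PropertyInfo, bool> share) where T : class
    {
        // Source-to-copy map keyed by reference, so cycles and shared
        // references inside the graph resolve to a single copy.
        var seen = new Dictionary<object, object>(ReferenceEqualityComparer.Instance);
        return (T)CloneObject(root, share, seen);
    }

    private static object CloneObject(object source, Func<PropertyInfo, bool> share,
                                      Dictionary<object, object> seen)
    {
        if (source == null) return null;
        if (seen.TryGetValue(source, out var existing)) return existing;

        var copy = Activator.CreateInstance(source.GetType());
        seen[source] = copy; // register before recursing so back-references work

        foreach (var prop in source.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (!prop.CanRead || !prop.CanWrite || prop.GetIndexParameters().Length > 0) continue;
            if (prop.Name == "Id") continue; // assumed key: left at default for the new row

            var value = prop.GetValue(source);
            var isScalar = prop.PropertyType.IsValueType || prop.PropertyType == typeof(string);
            var shared = share(prop);

            if (value == null || isScalar)
            {
                prop.SetValue(copy, value);
            }
            else if (value is IList list)
            {
                // Always build a new list; the items are shared or cloned.
                var target = (IList)Activator.CreateInstance(value.GetType());
                foreach (var item in list)
                    target.Add(shared ? item : CloneObject(item, share, seen));
                prop.SetValue(copy, target);
            }
            else
            {
                prop.SetValue(copy, shared ? value : CloneObject(value, share, seen));
            }
        }
        return copy;
    }
}
```

A call such as `GraphCloner.Clone(law, p => p.PropertyType == typeof(List<LawCategory>))` would copy the `Law` and its `Paragraph` children while the clone keeps pointing at the existing `LawCategory` objects. The predicate is the one place where the "do not clone this relation" decisions live, which is where the inherent complexity ends up. Updating clones that already exist ("upserting") is not covered by this sketch.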

> Making sure that the information of the just-updated version of the template is used involves passing around that specific as-yet-unsaved entity object or saving at every step, forgetting all objects and making a new context every time.

I did not quite understand what you mean here. But talking about multiple contexts makes me very alert because that's a common anti-pattern. Normally, you want to have one context per logical unit of work. Often, that UOW is an HTTP request or a WCF request or a user interaction. When all entities are part of the same context many issues go away.

Also, it's not necessary to keep objects unsaved. Generally, the database should be synchronized with the in-memory entity state. So when you create fresh objects as part of your template cloning procedure there should be no reason to not save them. It's not necessary to save after each new entity. For performance reasons try not to save too often.

If you elaborate more on specific issues I can add commentary.

answered Aug 12 '16 at 13:23

usr


Regarding many contexts, primarily I mean [this issue](http://stackoverflow.com/questions/699648/entity-framework-re-finding-objects-recently-added-to-context), where you can't easily "re-find" objects added to a dbset. Since we are trying to create fused "create-or-update" methods, this issue shows up more often than you might think. In EF6 we had `.Local` on every set to work around this. We're currently trying to use EF Core, but may have to go to EF6 just to get enough features back. – Jesper Aug 12 '16 at 13:35
We have attempted various levels of automating this, and it seems like the likely solution. However, we also have relations that should absolutely not be cloned, which as you say complicates things. The current solution is at least contained enough that the clones are populated in only one way, in one method, and there's only "one loop". I have seen one or two examples of reflection-based cloning, but none that showed any signs of being tested beyond very simple parent-child data or that dealt with "upserting" clones, that is to say updating existing ones. – Jesper Aug 12 '16 at 13:41
A concrete example of where "saving midpoint" solves a problem is when a Law entity's relations to a number of LawCategory entities (many-to-many) have been changed. First the entity is updated as such and the many-to-many table's entries are updated. But at the update-all-the-clones stage, the abstracted clone-many-to-many logic is not able to see these new entries with an existing context if a save is not made, because it tries to query these entries from the many-to-many table's entity dbset directly. (continued...) – Jesper Aug 12 '16 at 13:57
(continued) It would not need to do this if it could go via the relation collection on the entity object itself, but EF Core does not allow for lazy loading and this would put an extra burden on us to make sure we are updating these entries through the right relation/dbset that the rest of the code is expecting. But neither do we want to rely on "saving midpoint", in case that we would at some point have a temporarily invalid set of data. – Jesper Aug 12 '16 at 13:59
Yes, being able to "see" new entities is a good reason to save. You can even adopt the pattern of always saving after creating an entity. Lazily loaded collections are problematic because they can get out of date (scalar properties of entity types cannot get out of date because you're writing to them). I don't use entity collections at all; I always write a query. Rather than using Local, I recommend not relying on collection navigation properties at all. Often, you need some filter, join or ordering anyway. Navigation collections cannot execute any of that in the DB. – usr Aug 12 '16 at 14:04
