I am working on an application, trying to improve performance. Obviously I will be doing my own profiling and testing, but I would like to know if there is a "consensus" or known best practice.
In the old SQL days, one of the main things to do to improve efficiency was to not select data you aren't going to consume. I'm trying to go down that route with EF6.
In this particular case, I have a master-detail-detail relationship where I need to render some data about the parent, child, and grandchild on the screen.
My application is n-tier, with an MVC front end and a Web API REST back end. These entities will ultimately be serialized as JSON and sent over the REST connection back to the MVC controller, where they will be rendered to the screen. In this case I will not be updating the entities from this flow, so I don't need to worry about merging partial entities back into the repository (in those cases, I would probably send over the full entity for ease of maintenance).
So the original, straightforward EF code I wrote looks like this:
Repository.GetAll()
    .AsNoTracking()
    .Include("Children")
    .Include("Children.GrandChildren")
    .ToList();
However, I am only actually consuming a subset of the properties of these entities, and some of the unused properties can be rather large (big chunks of XML, etc.).
Here is a first pass at trying to project out only the fields I need (for the example here, I have cut out and renamed most of the fields I would actually select to improve readability, but in general I'm using, let's say, 5-20% of the full entities):
var projection = Repository.GetAll()
    .AsNoTracking()
    .Select(r => new
    {
        r.Id,
        r.RandomId,
        r.State,
        r.RequestType,
        r.CreatedDate,
        r.CreatedBy,
        Children = r.Children.Select(r2 => new
        {
            r2.Id,
            r2.Status,
            GrandChildren = r2.GrandChildren.Select(r3 => new
            {
                r3.Id,
                r3.Status,
                r3.GrandChildType
            })
        })
    })
    .ToList();
This is obviously using anonymous types. (I believed this was required, that there is no way in EF to project into a named type. Edit: apparently you can project into a non-mapped named type, but in this case the return type of the query is a mapped type, so I would have to create a DTO, and that's even more code to maintain.)
So then I have to get back into my concrete types. I could certainly generate DTOs that have only the properties needed, but I don't think that changes the fundamental logic used, nor probably the performance characteristics.
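For illustration, here is roughly what that DTO route would look like (the DTO names and property types below are hypothetical, since I trimmed the real fields above; EF6 will materialize into non-mapped types like these):

using System.Collections.Generic;
using System.Linq;

// Hypothetical DTOs; property types are guesses.
public class GrandChildDto
{
    public int Id { get; set; }
    public string Status { get; set; }
    public string GrandChildType { get; set; }
}

public class ChildDto
{
    public int Id { get; set; }
    public string Status { get; set; }
    public IEnumerable<GrandChildDto> GrandChildren { get; set; }
}

public class ParentDto
{
    public int Id { get; set; }
    public string State { get; set; }
    public IEnumerable<ChildDto> Children { get; set; }
}

// This is legal in LINQ to Entities because the DTOs are not mapped types.
var dtos = Repository.GetAll()
    .AsNoTracking()
    .Select(r => new ParentDto
    {
        Id = r.Id,
        State = r.State,
        Children = r.Children.Select(r2 => new ChildDto
        {
            Id = r2.Id,
            Status = r2.Status,
            GrandChildren = r2.GrandChildren.Select(r3 => new GrandChildDto
            {
                Id = r3.Id,
                Status = r3.Status,
                GrandChildType = r3.GrandChildType
            })
        })
    })
    .ToList();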
I tried my standbys, AutoMapper and ValueInjecter, but neither one seemed to fit the bill perfectly here (a deep clone across heterogeneous types with matching property names), so I went dirty:
var json = projection.Select(JsonConvert.SerializeObject).ToList();
var mapped = json.Select(JsonConvert.DeserializeObject<Parent>).ToList();
This is somewhat lame, since it's just going to be serialized again as part of the REST call. There is probably a way I can override the Web API calls to say I'm returning already-serialized data, which would let me skip the rehydration into the entity type (since all of the property names match, the REST client should be able to rehydrate the anonymous type as if it were the real type, the same way the snippet above does).
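Something like this is what I have in mind on the Web API side (just a sketch; the controller and the GetProjection() helper are hypothetical stand-ins for the query above):

using System.Net;
using System.Net.Http;
using System.Text;
using System.Web.Http;
using Newtonsoft.Json;

public class ParentsController : ApiController
{
    public HttpResponseMessage Get()
    {
        // GetProjection() stands in for the anonymous-type query above.
        var json = JsonConvert.SerializeObject(GetProjection());

        // Hand back the already-serialized JSON verbatim, bypassing
        // Web API's content negotiation and second serialization pass.
        return new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent(json, Encoding.UTF8, "application/json")
        };
    }
}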
BUT all this seems like a lot of work, less maintainable code, and more possible places to have bugs, for a use case that Entity Framework really does not seem to want to support. My old-school instincts can't let go of the idea that I'm selecting, serializing, and transferring a whole lot of data that I'm ultimately not going to consume.
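One packaged alternative I'm aware of but haven't verified against nested collections like these: AutoMapper's queryable extensions, which compose the mapping into the IQueryable itself so that only the mapped columns appear in the generated SELECT. A sketch, reusing the hypothetical DTOs above; the Child and GrandChild entity names are guesses, and older AutoMapper versions spell the projection call .Project().To<T>() instead of .ProjectTo<T>():

using AutoMapper;
using AutoMapper.QueryableExtensions;

// One-time static configuration; matching property names map by convention.
Mapper.CreateMap<Parent, ParentDto>();
Mapper.CreateMap<Child, ChildDto>();
Mapper.CreateMap<GrandChild, GrandChildDto>();

// The mapping is translated into the LINQ-to-Entities query itself,
// so only the DTO columns are selected.
var dtos = Repository.GetAll()
    .AsNoTracking()
    .ProjectTo<ParentDto>()
    .ToList();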
Does the anonymous-type projection produce sane SQL under the covers? And is it worth the double serialization (assuming I don't figure out how to override Web API to let me hand it the pre-serialized data)?
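I'll check the generated SQL myself with EF6's built-in logging; something like this, where MyContext and its Parents set are hypothetical names for my actual context:

using System;
using System.Linq;

using (var context = new MyContext())
{
    // EF6 writes every generated SQL statement through Database.Log.
    context.Database.Log = Console.Write;

    var projection = context.Parents
        .AsNoTracking()
        .Select(r => new { r.Id, r.State })
        .ToList();
}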
I suppose my other choice would be to refactor all the entities so that the unused properties live in separate sub-entities that I can simply not Include. But that would be a lot of rework throughout the system (versus being able to surgically improve performance at critical points), and it also seems like a poor choice to design entities around the ORM I happen to be using rather than around standard normalization rules, etc.
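If I did go that route, I believe EF6's table splitting would at least let me keep the existing database schema by mapping two entity types to the same table, so the heavy columns only load when the navigation property is asked for. A sketch with hypothetical type and column names:

using System.Data.Entity;

// The large XML payload moves to a second class, but both classes
// still map to the existing Parents table (EF6 table splitting).
public class Parent
{
    public int Id { get; set; }
    public string State { get; set; }
    public virtual ParentBlobData BlobData { get; set; }
}

public class ParentBlobData
{
    public int Id { get; set; }
    public string BigXmlPayload { get; set; }
}

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    // Shared primary key, required 1:1, same physical table.
    modelBuilder.Entity<Parent>()
        .HasRequired(p => p.BlobData)
        .WithRequiredPrincipal();

    modelBuilder.Entity<Parent>().ToTable("Parents");
    modelBuilder.Entity<ParentBlobData>().ToTable("Parents");
}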