This question is sort of a sequel to that question.
When we build a WCF service that works with some kind of data, we naturally want it to be fast and efficient. To achieve that, we have to make sure every segment of the data's round trip works as fast as it can, from the data storage back end, such as SQL Server, to the WCF client that requested the data.
While seeking an answer to that previous question, we learned, thanks to Slauma and others who contributed through comments, that the time-consuming part of Entity Framework's (first) large query is object materialization and attaching entities to the context when the result from the database is returned. We have seen that everything works much faster on subsequent queries.
Assuming those large queries are used as read-only operations, we concluded that we could set EF's MergeOption to NoTracking, yielding better first-query performance. What NoTracking does is tell EF to create a separate object for each record retrieved from the database, even when records have the same key. This causes additional processing if the query has .Include() statements, and it leads to a much larger amount of data being returned.
The data may be so big that we could easily ask ourselves whether we really helped our cause by using the NoTracking option, even if it made the query faster. It may also make only the first query faster, depending on the number of .Include() statements: subsequent queries without NoTracking and with multiple .Include() statements run faster simply because NoTracking causes many more objects to be created when data returns from the server.
The biggest problem is how to serialize this amount of data efficiently, and how to deserialize it on the client. With serialization already as slow as it is (I am using DataContractSerializer with PreserveObjectReferences set to true because I am sending EF 4.x generated POCOs to my client and vice versa), do we want to generate even more data (thanks to NoTracking)? To be honest, I have not yet seen the data from a NoTracking query over ~11,000 objects (not counting navigation properties obtained via .Include()) arrive at the client side. The last time I tried to pull this off, the 00:10:00 timeout was triggered (!).
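To make the size concern concrete, here is a minimal sketch, not my real service code. It assumes .NET Framework 4.x and uses two hypothetical POCOs (Customer and Order) in place of the EF-generated ones. It serializes the same logical data twice with DataContractSerializer and preserveObjectReferences set to true: once with every order pointing at a single shared Customer instance, and once with a separate Customer copy per order, which mimics what NoTracking does to same-key entities as described above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical POCOs standing in for the EF 4.x generated entities.
[DataContract]
public class Customer
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
}

[DataContract]
public class Order
{
    [DataMember] public int Id { get; set; }
    [DataMember] public Customer Customer { get; set; }
}

public static class Program
{
    // Serializes the list with preserveObjectReferences = true
    // and returns the size of the resulting XML in bytes.
    static long SerializedSize(List<Order> orders)
    {
        var serializer = new DataContractSerializer(
            typeof(List<Order>),
            null,           // knownTypes
            int.MaxValue,   // maxItemsInObjectGraph
            false,          // ignoreExtensionDataObject
            true,           // preserveObjectReferences
            null);          // dataContractSurrogate

        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, orders);
            return stream.Length;
        }
    }

    public static void Main()
    {
        const int count = 11000; // roughly the row count from the question

        var sharedCustomer = new Customer { Id = 1, Name = "Contoso" };
        var sharedGraph = new List<Order>();
        var duplicatedGraph = new List<Order>();

        for (int i = 0; i < count; i++)
        {
            // One Customer instance referenced by every order.
            sharedGraph.Add(new Order { Id = i, Customer = sharedCustomer });

            // A separate Customer object per order, even though the key is the same.
            duplicatedGraph.Add(new Order
            {
                Id = i,
                Customer = new Customer { Id = 1, Name = "Contoso" }
            });
        }

        Console.WriteLine("Shared instance:      {0} bytes", SerializedSize(sharedGraph));
        Console.WriteLine("Duplicated instances: {0} bytes", SerializedSize(duplicatedGraph));
    }
}
```

With a shared instance the serializer writes the customer once and emits references afterwards, while every duplicated copy is written out in full, so the second payload should come out larger. The gap would presumably grow with wider entities and more .Include() paths.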
So if you are still reading this wall of text: how would you solve this situation? Which serializer should I use to achieve acceptable results? Currently, if I don't use the NoTracking option, serialization, transport and deserialization of ~11,000 objects via a wsHttpBinding-like custom binding on the local machine take ~5 seconds. What's scary to me is that this large table is most likely going to contain ~500,000 records eventually.