Good behaviour for eager loading multiple siblings and grandchildren (cousins?) in NHibernate 3.0 Linq

Question

I'm trying to do the following with NHibernate 3.0's LINQ interface. I want to query for an object (using some Where clause), and load some children and grandchildren. Currently I'm doing it like so:

var results = session.Query<Thing>()
                     .Where(...)
                     .Fetch(x => x.SubThingA)
                     .ThenFetch(st => st.SubSubThingA)

                     .Fetch(x => x.SubThingB)
                     .ThenFetch(st => st.SubSubThingB)

                     // etc...

However, this results in a Cartesian product between all grandchildren (every result row contains many, many columns). This is discussed by "ayende" here. On the other hand I get a single round-trip, unlike splitting the query and then combining it.

How can I do it in a better (SQL and performance-wise) way, still using NHibernate's LINQ interface?

(For one thing, I've noticed that currently the ToFuture methods don't work when you use Fetch)

Thanks a lot!

A related question: is there a way using LINQ to perform multiple queries in one round-trip? (given that Future doesn't seem to work on LINQ queries that use Fetch) — sinelaw, Feb 28 '11 at 15:55
How many children are we talking about here? Sending a complete object graph to a remote client (I'm doing that, too) is more costly performance-wise than hitting the database two or three times instead of once. In such a case I usually split the Query. Also, you should consider if you need the whole object graph right from the start or if you could also load some other children dynamically when needed. — Florian Lim, Feb 28 '11 at 16:30
Florian Lim, the object has many, many children. I also would rather hit the DB than the client, but I'm trying to avoid that and perform the whole thing in a single query. — sinelaw, Feb 28 '11 at 16:31
Afaik it cannnot be done with NHibernate Linq 3.0 but I will watch this question and if you get a working answer I'll gladly use that. (+1) — Florian Lim, Feb 28 '11 at 16:42

Diego Mijelshon · Answer 1 · 2011-03-01T11:40:48.007

2

In most cases, you'll get better performance by using batch-size in entities and collections instead of creating a mega-query.

Best case scenario, it's a query by id per root entity type.

Let's say you have a root entity Customer, which has a collection of Orders, which have a collection of OrderItems, which reference Products, and all batch-size properties are set to 1000.

Say you retrieve a list of 10 customers, which have in average 10 orders with 10 products each:

var results = session.Query<Customer>().Where(...).Take(10).ToList();

The first query will fetch just the customers.
When you start iterating the first customer.Orders collection, one query will be used to load all of them (for all the customers)
When you start iterating the first order.OrderItems collection, one query will be used to load all of them (for all the orders and all the customers)
When you read a property from the first product, one query will be used to load all of them

So, you have just 4 queries, with no joins at all, retrieving everything by PK. It's easy and efficient.

edited Mar 01 '11 at 11:40

answered Feb 28 '11 at 16:00

Diego Mijelshon

52,548
16
116
154

Thanks for the reply. My situation is such that I need to load ALL the required object graph (it is being sent to a remote client). From what I gather batch-size is good when you maintain an open session and are using the lazy interface. What's the best solution when the a big graph is required from the start? – sinelaw Feb 28 '11 at 16:04
@sinelaw: From your comment, I understand you're trying to serialize an object graph "as-is". Don't. You're just artificially limiting yourself. – Diego Mijelshon Feb 28 '11 at 17:15
Diego - what do you mean by "artificially limiting yourself"? Can you explain what problem you're referring to? – sinelaw Feb 28 '11 at 17:22
1

@sinelaw: if you serialized the graph using an explicit contract, you'd be able to let NH work its magic. – Diego Mijelshon Feb 28 '11 at 17:36
Diego - ok, got it, sounds like a better idea indeed. – sinelaw Feb 28 '11 at 17:40
Deigo - what about the grandchild nodes? If I leave their fetching to laziness, even though batch-size will help me to iterate the child nodes with less DB round-trips, I'd still require at least one round-trip per child node to fetch grandchild collections. What would you do in this case? (I know for sure the grandchild collections will need to be accessed for every child node) – sinelaw Mar 01 '11 at 08:12

score 1 · Accepted Answer · answered Jun 28 '11 at 06:53

Although Diego's answer is the accepted method for doing these things in NHibernate, I really don't feel comfortable with that approach. I don't want to have to define explicit contracts for my objects just because I may need to retrieve them in certain ways. Also, I don't always want to serialize them at all. Furthermore, in many cases I know that the best performance will always be a single round-trip to fetch all the data.

The solution I ended up using was to implement a function that takes a list of (type safe) expressions on the root object, such as

x => x.Child.GrandChild1
x => x.Child.GrandChild2Collection.SubInclude(c => c.GreatGrandChild)

Where SubInclude is an extension method for IEnumerable that is used when parsing these expressions.

I parse this list of expressions and build, for each subpath of each expression (x, x.Child, x.Child.GrandChild1) an NHibernate criteria query on the root type:

var queryOver = session.QueryOver<T>().Where( ...expression to select root objects... );
for every subpath in the current expression:
    queryOver.RootCriteria.SetFetchMode(subPath, FetchMode.Eager)

queryOver.RootCriteria
         .SetResultTransformer(new DistinctRootEntityResultTransformer())

queryOver.Future()

This is repeated for every expression in the list. The last line makes sure that this eager fetch will be included in whatever round-trip happens next. Then I make the actual query on the root object T, and the session automatically performs in that same round-trip all the queries required to fetch each of the paths I've passed in the expressions.

The queries are performed separately for each expression path, so there is no cartesian-product problem.

Bottom line is that this is no simple feat. There's a little too much code for me to publish as-is. I prefer EF4.1's Include(expression) API, which does all of this automagically.

**Important**: all my collections in my entity classes are defined as Sets. Otherwise, NHibernate ends up returning duplicates of grandchildren and deeper nodes. — sinelaw, Jun 28 '11 at 06:56

Good behaviour for eager loading multiple siblings and grandchildren (cousins?) in NHibernate 3.0 Linq

2 Answers2

Linked