
Given an object graph like:

A { IEnum<B> }
B { IEnum<C>, IEnum<D>, IEnum<E>, ... }
C { IEnum<X> }

How can I eagerly load the entire object graph without N+1 issues?

Here is pseudocode for the queries that I would ultimately like to execute:

var a = Session.Get<A>(1); // Query 1
var bIds = a.Bs.Select(b => b.Id).ToList(); // Query 2 (Bs = A's collection of B)
var c = Session.CreateQuery("from C c where c.B.Id in (:bIds)").SetParameterList("bIds", bIds).Future<C>(); // Query 3
var d = Session.CreateQuery("from D d where d.B.Id in (:bIds)").SetParameterList("bIds", bIds).Future<D>(); // Query 3
var e = Session.CreateQuery("from E e where e.B.Id in (:bIds)").SetParameterList("bIds", bIds).Future<E>(); // Query 3

// Iterate through c, d, e, ... find the correct 'B' parent, add to collection manually

The problem with this approach is that when I add the instances of 'C', 'D', and 'E' to the corresponding collection of the parent 'B', the collection is still proxied. Calling .Add() makes the proxy initialize itself and execute more queries. I think NHibernate cannot tell that I already have all of the data in the first-level cache, which is understandable.

I've tried to work around this problem by doing something like this in my Add method:

void Add(IEnumerable<C> items)
{
    _collection = new Collection<C>(); // replace the proxied instance to prevent initialization
    foreach (var c in items) _collection.Add(c);
}

This gave me the optimal query strategy that I wanted, but it caught up with me later when persisting (from what I can tell, NHibernate tracks the original collection by reference somewhere).

So my question is, how can I load a complex graph with children of children without N+1? The only thing I've come across to date is joining B-C, B-D, B-E which is not acceptable in my situation.

We are using NH 2.1.2 with Fluent NHibernate for mapping. Upgrading to NH v3 or using hbm files/stored procs/whatever would not be off the table.

UPDATE: One of the comments references a join approach, and I did come across a blog that demonstrates it. This workaround is not acceptable in our situation, but it may help someone else: Eager fetch multiple child collections in 1 round trip with NHibernate

UPDATE 2: Jordan's answer led me to the following posts that are related to my question: Similar Question and Ayende's blog. The pending question at this point is "how can you perform the subselects without a round trip per path?"

UPDATE 3: I've accepted Jordan's answer even though the subselect solution is not optimal.

Pat

2 Answers


You can use subselect fetching, which can be set up in the mapping files. This avoids both N+1 queries and a cartesian product.
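
As a rough illustration of what that mapping might look like, here is a minimal hbm.xml sketch for 'B'. The class, property, table, and column names (`Cs`, `Ds`, `B_Id`, and so on) are assumptions made up for this example and do not come from the question; only the `fetch="subselect"` attribute is the point.

```xml
<!-- Hypothetical mapping: all names below are illustrative assumptions -->
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <class name="B" table="B">
    <id name="Id" column="Id">
      <generator class="native" />
    </id>

    <!-- fetch="subselect": when one B's Cs collection is first initialized,
         NHibernate loads the Cs of every B returned by the originating query
         with one additional select, instead of one select per B -->
    <bag name="Cs" inverse="true" fetch="subselect">
      <key column="B_Id" />
      <one-to-many class="C" />
    </bag>

    <!-- Same idea for each sibling collection; each one still costs
         its own round trip when it is initialized -->
    <bag name="Ds" inverse="true" fetch="subselect">
      <key column="B_Id" />
      <one-to-many class="D" />
    </bag>
  </class>
</hibernate-mapping>
```

With Fluent NHibernate, the equivalent should be roughly `HasMany(x => x.Cs).Fetch.Subselect();` inside the `ClassMap<B>`, again assuming a `Cs` property.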

Jordan
  • I thought that this would work for collections at a depth of 1, and cause N+1 for collections at a depth of 2, but that doesn't seem to be the case. From a test app it looks like you just end up with 2 levels of subselects. I can't imagine the subselects would be that expensive, and they should take advantage of the principle of locality... interesting. The number of round trips is still an issue - can you think of a way to batch all of the subselects together into a single round trip? – Pat May 22 '11 at 17:49
  • I'm using SQLite, so batching isn't supported. I'd imagine you could use ADO.NET batching that NHibernate uses for this scenario. – Jordan May 23 '11 at 13:43
  • I've come to the conclusion that this is not possible with the current NH (without modifying the source)... This question is really the same question that I have now: http://stackoverflow.com/questions/5262103/nhibernate-how-to-perform-eager-subselect-fetching-of-many-children-grandchild - I would up vote your answer but I'm not registered :( – Pat May 24 '11 at 15:02
  • No problem. If you remember when you register, I'd appreciate it. Good luck! – Jordan May 24 '11 at 19:13

First, you can change your mappings to load these collections eagerly; see item #4 in this section.
Second, I believe the reason your collection seems to load twice is that you first fetch it with a query and then access it through the collection property.
NHibernate distinguishes between queries issued by the user (like the one you use) and queries it generates itself (like the one that runs when you first read your 'C' collection). The two do not mix.
So when you first read your 'C' collection, NHibernate does not recognize that the exact same query was already sent to the DB (since it was a user query), and sends it again.
The way to avoid this is to retrieve your C collection via your B entity.

J. Ed
  • I think you missed the part where I said "without N+1 issues" which is the real problem here. When you eager load an object model like this, you suffer from N+1 queries. In my example, for every "B", you will have a unique query for "C", "D", and "E" (10 B's = 10 C + 10 D + 10 E queries = 30 queries!). – Pat May 21 '11 at 20:43
  • @Pat: you can use the "fetch-type=join" option to load your Bs / Cs /Ds in the same query as your As – J. Ed May 22 '11 at 07:37
  • Also mentioned in the question: "The only thing I've come across to date is joining B-C, B-D, B-E which is not acceptable in my situation." By default, NH will create a cartesian product across all children, which is terrible. There are ways with the criteria API to join per child independently, but because these child tables have 100 million+ records, our DBA will not accept the joins. Also, B, C, and D are meant to simplify the problem; we actually have 11 child collections at that level. – Pat May 22 '11 at 15:25