0

We are having trouble with the way NHibernate (version 4.0.0.4000 AND 4.0.4.4000 tested) returns duplicate results. In the sample below, I get 566 results (the correct number of results), but only 549 are unique, meaning there are 17 duplicates.

#region Get Record IDs
public IList<string> GetRecordIds(string user, string agency, DateTime utcFrom, DateTime utcTo, SearchDateRangeType dateRangeType, IEnumerable<string> status, IEnumerable<string> billingStatus, IEnumerable<string> qaStatus, IEnumerable<string> transmissionStatus, IEnumerable<string> scheduledTransmissions, int pageSize = -1, int pageNumber = -1)
{
    using (ISession session = NHibernateHelper.OpenSession())
    {
        ICriteria crit = session.CreateCriteria<Metadata>();

        var dateDisjunction = Restrictions.Disjunction();

        dateDisjunction.Add(Restrictions.Between("IncidentDate", utcFrom, utcTo));

        crit.Add(dateDisjunction);

        if (string.IsNullOrEmpty(agency) == false)
        {
            crit.CreateAlias("Ownership._entities.AsIList", "entities");
            crit.Add(Restrictions.Eq("entities._entityName._value", agency));
            crit.Add(Restrictions.Eq("entities._isDeleted._value", false) || Restrictions.IsNull("entities._isDeleted._value"));
        }

        crit.AddOrder(Order.Asc(Projections.Property("RecordId")));

        crit.SetProjection(Projections.Property("RecordId"));

        if (pageSize > 0 && pageNumber > 0)
        {
            crit.SetFirstResult(pageSize * (pageNumber - 1)).SetMaxResults(pageSize);
        }

        var ret = crit.List<string>();

        return ret;
    }
}
#endregion

SQL Sample 1 is the generated first iteration code from NHibernate. Subsequent pages (second page onward) use ROW_NUMBER() OVER. SQL Sample 2 is a manually-created first page, which uses ROW_NUMBER() OVER as if it was a subsequent page. NHibernate has apparently "optimized" away the ROW_NUMBER() OVER for the first page and that seems(?) to be the cause of our issues.

SQL Sample 1: Generated by NHibernate. Causes duplicates.

SELECT
    TOP (100) this_.RecordId as y0_ 
FROM
    PcrMetadata this_ 
inner join
    PcrEntities entities1_ 
        on this_.Id=entities1_.ListKey 
WHERE
    (
        this_.IncidentDate between '0001-01-01 00:00:00.0000000' and '9999-01-01 00:00:00.0000000'
    ) 
    and entities1_.Name = 'ClientIDNumber' 
    and (
        entities1_.Entities_IsDeleted = 0 
        or entities1_.Entities_IsDeleted is null
    )

SQL Sample 2: Manually created based on NHibernate second page on. Does not cause duplicates.

SELECT
    TOP (100) this_.RecordId as y0_  
FROM
    (SELECT
        this_.Record as y0_,
        ROW_NUMBER() OVER(
    ORDER BY
        CURRENT_TIMESTAMP) as __hibernate_sort_row 
    FROM
        PcrMetadata this_ 
    inner join
        PcrEntities entities1_ 
            on this_.Id=entities1_.ListKey 
    WHERE
        (
            this_.IncidentDate between '0001-01-01 00:00:00.0000000' and '9999-01-01 00:00:00.0000000'
        ) 
        and entities1_.Name = 'ClientIDNumber' 
        and (
            entities1_.Entities_IsDeleted = 0 
            or entities1_.Entities_IsDeleted is null
        )) as query 
WHERE
    query.__hibernate_sort_row > 0 -- CHANGE THIS NUMBER

Am I doing something wrong? Or there anything I can do to force NHibernate to use ROW_NUMBER? Thanks in advance for any help!

Radim Köhler
  • 122,561
  • 47
  • 239
  • 335
dythim
  • 920
  • 7
  • 24

1 Answers1

1

We cannot JOIN collections and apply paging. Because we are getting cartesian product, which is paged (experience described above).

The solution I would suggest, is to (my way NEVER) join collection. To get the similar results, we should:

  • use subquery to apply WHERE
  • use fetch batching to later recieve all collection items without 1 + N issue

There is detailed answer about this issue.

see also:

There is more about making result distinct, but this could not help here:

Community
  • 1
  • 1
Radim Köhler
  • 122,561
  • 47
  • 239
  • 335
  • Thanks for your response! Just so I'm clear, is there an issue even when the Ownership will have only one entry returned from the query? In both sets of generated SQL, it is a particular name it searches for in Ownership, and in our application, that will be a single row. I've also verified manually with SQL. – dythim Sep 18 '15 at 04:17
  • In case, that there is exactly one (but you have to be sure that you can guarantee that even by some DB contsraint) - you can join collection and your paging will work. But my experience is - NEVER do that. it is by conception not correct. Yes, its up to you, but ... at least try to think about my suggstions. I showed you the way how to FITLER and even SELECT collections without 1 + N ... while still be able to page over multi-items-lists – Radim Köhler Sep 18 '15 at 05:14
  • Thanks, I do see how what you are saying can be an issue, but I'm concerned that it's not the actual issue I'm seeing. I can see the issue with SQL, independently of NHibernate. The two sql samples are page 0. NHibernate "optimized" their page 0 without `ROW_NUMBER` -- so the query result should be the same as the SQL query with `ROW_NUMBER` -- but it is not. Does my logic make sense? – dythim Sep 18 '15 at 06:12
  • I would say (sorry for that, I really do try to help) - you are experiencing WRONG results because the approach is wrong. And that won't stop until you will follow the better (maybe mine) solution. Because, the issue here is 99% not related to ROW_NUMBER but to fact, that first page takes only 1:1 collections but on the second is first root with more then 1:N... that is the max I can see or say... wish it helps a bit... – Radim Köhler Sep 18 '15 at 06:28
  • Thanks, I appreciate it. I will take some time to look for a solution with what you referenced. – dythim Sep 18 '15 at 06:50
  • I've realized that the way I phrased the question may have been misleading. The key problem is *not* that we are receiving duplicates. The problem is that we are not receiving results that we expect to receive, and are receiving duplicate entries *in place of* those results. – dythim Sep 18 '15 at 21:10