1

I have a SQL CLR aggregate function which sporadically throws a System.NullReferenceException exception when executed against the same dataset. The purpose of the custom aggregate is to:

Return latest(x, y) where x is a DATETIME column and y is an INTEGER column.

The value for column y for the most recent value of column x will be returned.

The dataset being hit by the query is a subset of 142,145 rows of a 2,931,563 row table, with the aggregation resulting in (when it runs) 141,654 rows being returned.

The code for the CLR aggregate function is as follows:

using System.Data.SqlTypes;
using System.Runtime.InteropServices;
using Microsoft.SqlServer.Server;

[StructLayout(LayoutKind.Sequential)]
[SqlUserDefinedAggregate(Format.Native, IsInvariantToDuplicates = true, IsInvariantToNulls = true, IsInvariantToOrder = true, IsNullIfEmpty = true, Name = "Latest")]
public class Latest
{
    private SqlDateTime latestDateTime;
    private SqlInt32 latestValue;

    public void Init()
    {
        latestDateTime = SqlDateTime.Null;
        latestValue = SqlInt32.Null;
    }

    public void Accumulate(SqlDateTime recordDateTime, SqlInt32 value)
    {
        if (latestDateTime.IsNull)
        {
            latestDateTime = recordDateTime;
            latestValue = value;
        }
        else
        {
            if (recordDateTime > latestDateTime)
            {
                latestDateTime = recordDateTime;
                latestValue = value;
            }
        }
    }

    public void Merge(Latest value)
    {
        if ((value.latestDateTime < latestDateTime) || (latestDateTime.IsNull))
        {
            latestValue = value.latestValue;
            latestDateTime = value.latestDateTime;
        }
    }

    public SqlInt32 Terminate()
    {
        return latestValue;
    }
};

As far as I can tell there's nowhere in the function that can result in a null reference, assuming that SQL server is following the contract outlined on MSDN (though it's much more likely I'm wrong than SQL Server!). So in short, what am I missing here?

To clarify:

  • I believe I've met the requirements of the contract for a SqlUserDefinedAggregate (all required methods implemented)
  • The code initialises all member variables in the Init method (again part of the contract implementation) to ensure that if SQL re-uses (as it's permitted to) an instance of the aggregate for a different group it's cleaned down and non-null
  • Clearly I've missed a nuance of the contract that I'm expected to meet as I can see no reason for the NullReferenceException to be thrown. What have I missed?
Rob
  • 45,296
  • 24
  • 122
  • 150
  • This is a partial class, there other code ? There are null values on x column for the subset ? – Max Aug 05 '15 at 13:24
  • No idea why the `partial` is there as there's only the one file that defines this class. Though it'll make both our comments seem a bit odd, I'll remove the `partial` from the code to aid clarity :) – Rob Aug 05 '15 at 13:25
  • The only even remotely possible suspect I see is value.latestDateTime (in the Merge void). If it's possible for value to be null, I think it could trigger the null reference error. – Tab Alleman Aug 05 '15 at 13:26
  • Do you necessarily have to use a CLR aggregate? How about using a windowing function, like FIRST_VALUE(y) OVER (ORDER BY x DESC) if you're on SQL Server 2012. If not 2012 you could still use ROW_NUMBER() to get to a similar solution – David Ferretti Aug 05 '15 at 13:43
  • Please do enough debugging that your question can be more specific than "why does this throw NullReferenceException?" If and when you do post a new, better question, be sure it includes [a good, _minimal_, _complete_ code example](https://stackoverflow.com/help/mcve) that reliably reproduces the problem. – Peter Duniho Aug 06 '15 at 00:06
  • @PeterDuniho, did you actually read my question in its entirety before marking it as a duplicate of the incredibly generic "I done got a NullReferenceException" question? As stated, as far as I can tell (_which is the crux of the matter as I must've missed something_), I've met all the requirements that SQL CLR places on me (primarily initialising all member variables in `Init` to allow for instance re-use). This *is* the simplest code that reproduces the issue. I appreciate stackoverflow gets littered with "I done got a NullReferenceException, halp!" questions, but that doesn't mean that... – Rob Aug 06 '15 at 07:07
  • ... every single one is worthy of being marked as a duplicate of "What is a NullReferenceException and how do I fix it?" – Rob Aug 06 '15 at 07:07
  • @David, in a simple case the use of partition functions (we're on SQL Server 2008 R2/SP3 and alas upgrading isn't an option at present) returns both the correct results and satisfactory performance, however for more complex (i.e. production) queries, the performance tails off as compared to the CLR. In an ideal world, I'd not be using a CLR, but this isn't an ideal world :( – Rob Aug 06 '15 at 07:13
  • @TabAlleman, Theoretically `value` should never be null as it'll be an initialised instance of the aggregate that's being passed in. That said, it could be that `value.latestDateTime.IsNull` is true (remember, the `latestDateTime` member is a `SqlDateTime` which is sort of like `DateTime?` and that has a weird interaction with `<` but that would be surprising (I'm preparing to be surprised!). I'll add a few unit tests to see how that pans out later today. – Rob Aug 06 '15 at 07:17
  • As a Java programmer, this seems odd `((value.latestDateTime < latestDateTime) || (latestDateTime.IsNull))`. Shouldn't it be better `(latestDateTime.IsNull)) || ((value.latestDateTime < latestDateTime)` – SJuan76 Aug 06 '15 at 07:17
  • Rob, were you able to try out anything that I (or anyone else answering here) suggested? – Solomon Rutzky Aug 12 '15 at 20:45

3 Answers3

1

This may turn out not to be the answer, but as I need to include code, I won't post it as a comment.

Having a NULL value for value.latestDateTime wouldn't cause this error. You only get the NULLReferenceException when an OBJECT is null, and you try to refer to it (by accessing it's properties for instance).

The only place in your code where I see you referencing an object is in the Merge void. The object is value which is of type Latest.

I know you say it's not theoretically possible for value to ever be null, but that can't be proven or disproven in the code you've posted since Merge is public and therefore accessed from other applications.

I have found in OOP that it's safest to ALWAYS null-check any object before referencing it. All you have to do is change your Merge code to this (if my c# memory serves...if my syntax is off, I'm sure you'll get the idea):

public void Merge(Latest value)
{
    if (value != null)
    {
        if ((value.latestDateTime < latestDateTime) || (latestDateTime.IsNull))
        {
            latestValue = value.latestValue;
            latestDateTime = value.latestDateTime;
        }
    }
}

It's up to you if you want to do something else when value IS null. But this makes this stretch of code absolutely safe from NullReferenceExceptions, rather than trusting to "theoretically shouldn't be possible", which almost always means "IS possible" ; )

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52
  • The parameter coming into the `Merge` won't ever be `null`, else there was nothing to merge. I have never done a NULL check in the Merge and have never gotten an error or even heard of such a thing happening. – Solomon Rutzky Aug 06 '15 at 17:32
1

Everything does appear to be correct. Except for possibly one thing. Can either of the source columns for the two input parameters be NULL? The code itself does seem to handle NULLs. However, if there are NULLs in the data then I think you have the IsInvariantToNulls property of the SqlUserDefinedAggregate attribute set incorrectly as it should be false since a NULL in either field could affect the result.

Also:

  • The datatype of the field being used for recordDateTime is DATETIME and not DATETIME2, correct?

  • Try running the query after adding OPTION(MAXDOP 1) as a query hint as that should prevent the Merge method from being called and that can help narrow down the issue. If the exception never happens with OPTION(MAXDOP 1) then it most likely has to do with the Merge method. But if it still occurs, then it most likely has nothing to do with the Merge method.

  • To clarify some questioning of how SqlDateTime handles comparison operators (i.e. the < and >) when .IsNull is true: it is handled properly and will return a false if either side of those operators has .IsNull == true.

  • I would also remove the [StructLayout(LayoutKind.Sequential)] decorator. The default .NET handling should be fine.

Solomon Rutzky
  • 46,688
  • 9
  • 128
  • 171
0

I would add explicit check for NULL:

public void Accumulate(SqlDateTime recordDateTime, SqlInt32 value)
{
    if (latestDateTime.IsNull)
    {
        latestDateTime = recordDateTime;
        latestValue = value;
    }
    else
    {
        if (!recordDateTime.IsNull)
        {
            if (recordDateTime > latestDateTime)
            {
                latestDateTime = recordDateTime;
                latestValue = value;
            }
        }
    }
}

Besides, your logic in Merge seems wrong... I'd repeat the same code as in Accumulate:

public void Merge(Latest value)
{
    if (latestDateTime.IsNull)
    {
        latestDateTime = value.latestDateTime;
        latestValue = value.latestValue;
    }
    else
    {
        if (!value.latestDateTime.IsNull)
        {
            if (value.latestDateTime > latestDateTime)
            {
                latestDateTime = value.latestDateTime;
                latestValue = value.latestValue;
            }
        }
    }
}
Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90