202

What is the best way to get the Max value from a LINQ query that may return no rows? If I just do

Dim x = (From y In context.MyTable _
         Where y.MyField = value _
         Select y.MyCounter).Max

I get an error when the query returns no rows. I could do

Dim x = (From y In context.MyTable _
         Where y.MyField = value _
         Select y.MyCounter _
         Order By MyCounter Descending).FirstOrDefault

but that feels a little obtuse for such a simple request. Am I missing a better way to do it?

UPDATE: Here's the back story: I'm trying to retrieve the next eligibility counter from a child table (legacy system, don't get me started...). The first eligibility row for each patient is always 1, the second is 2, etc. (obviously this is not the primary key of the child table). So, I'm selecting the max existing counter value for a patient, and then adding 1 to it to create a new row. When there are no existing child values, I need the query to return 0 (so adding 1 will give me a counter value of 1). Note that I don't want to rely on the raw count of child rows, in case the legacy app introduces gaps in the counter values (possible). My bad for trying to make the question too generic.

wingerse
gfrizzle

17 Answers

218

Since DefaultIfEmpty isn't implemented in LINQ to SQL, I did a search on the error it returned and found a fascinating article that deals with null sets in aggregate functions. To summarize what I found, you can get around this limitation by casting to a nullable within your select. My VB is a little rusty, but I think it'd go something like this:

Dim x = (From y In context.MyTable _
         Where y.MyField = value _
         Select CType(y.MyCounter, Integer?)).Max

Or in C#:

var x = (from y in context.MyTable
         where y.MyField == value
         select (int?)y.MyCounter).Max();
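
As a rough illustration of why the nullable cast helps, here is a small LINQ to Objects sketch (in memory, not LINQ to SQL, so provider behavior may differ). The `counters` array is a hypothetical stand-in for the rows matched by the Where clause. Over a plain `int` sequence, `Max` throws on empty input; over `int?` it returns `null`, which `??` can turn into 0 before adding 1 for the next counter.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the matching rows (none in this case).
int[] counters = Array.Empty<int>();

// Non-nullable Max throws InvalidOperationException on an empty sequence.
bool threw = false;
try { counters.Max(); }
catch (InvalidOperationException) { threw = true; }

// Projecting to int? makes Max return null instead of throwing.
int? maxOrNull = counters.Select(c => (int?)c).Max();

// Coalesce to 0, then add 1 to get the next counter value.
int next = (maxOrNull ?? 0) + 1;

Console.WriteLine($"threw={threw}, hasMax={maxOrNull.HasValue}, next={next}");
```
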
Jacob Proffitt
  • 1
    To correct the VB, the Select would be "Select CType(y.MyCounter, Integer?)". I have to do an original check to convert Nothing to 0 for my purposes, but I like getting the results without an exception. – gfrizzle Dec 05 '08 at 13:51
  • 3
    One of the two overloads of DefaultIfEmpty is supported in LINQ to SQL - the one that doesn't take parameters. – DamienG Dec 05 '08 at 17:46
  • Possibly this information is out of date, as I just successfully tested both forms of DefaultIfEmpty in LINQ to SQL – Neil Dec 21 '11 at 10:08
  • 6
    @Neil: please make an answer. DefaultIfEmpty doesn't work for me: I want the `Max` of a `DateTime`. `Max(x => (DateTime?)x.TimeStamp)` still the only way.. – duedl0r Jan 20 '12 at 17:59
  • 1
    Although DefaultIfEmpty is now implemented in LINQ to SQL, this answer remains better IMO, as using DefaultIfEmpty results in a SQL statement 'SELECT MyCounter' that returns a _row for every value being summed_, whereas this answer results in MAX(MyCounter) that returns a single, summed row. (Tested in EntityFrameworkCore 2.1.3.) – Carl Sharman Sep 06 '19 at 09:24
126

I just had a similar problem, but I was using LINQ extension methods on a list rather than query syntax. The casting to a Nullable trick works there as well:

int max = list.Max(i => (int?)i.MyCounter) ?? 0;
Eddie Deyo
55

Sounds like a case for DefaultIfEmpty (untested code follows):

Dim x = (From y In context.MyTable _
         Where y.MyField = value _
         Select y.MyCounter).DefaultIfEmpty.Max
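
For reference, this is roughly how DefaultIfEmpty behaves on in-memory sequences (LINQ to Objects). The arrays are hypothetical stand-in data, and whether a given database provider can translate the call is a separate question that this sketch does not answer.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the filtered counter values.
int[] none = Array.Empty<int>();
int[] some = { 1, 2, 5 };

// DefaultIfEmpty() yields a single default(int), i.e. 0, when the source is empty.
int maxNone = none.DefaultIfEmpty().Max();

// A non-empty source passes through unchanged.
int maxSome = some.DefaultIfEmpty().Max();

// The overload that takes an argument lets the fallback be chosen explicitly.
int maxWithFallback = none.DefaultIfEmpty(-1).Max();

Console.WriteLine($"{maxNone} {maxSome} {maxWithFallback}");
```
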
BartoszKP
Jacob Proffitt
  • I'm not familiar with DefaultIfEmpty, but I get "Could not format node 'OptionalValue' for execution as SQL" when using the syntax above. I also tried providing a default value (zero), but it didn't like that either. – gfrizzle Dec 04 '08 at 19:28
  • Ah. Looks like DefaultIfEmpty isn't supported in LINQ to SQL. You could get around that by casting to a list first with .ToList but that's a significant performance hit. – Jacob Proffitt Dec 04 '08 at 21:52
  • 3
    Thanks, this is exactly what I was looking for. Using extension methods: `var colCount = RowsEnumerable.Select(row => row.Cols.Count).DefaultIfEmpty().Max()` – Jani May 13 '15 at 12:03
37

Think about what you're asking!

The max of {1, 2, 3, -1, -2, -3} is obviously 3. The max of {2} is obviously 2. But what is the max of the empty set { }? Obviously that is a meaningless question. The max of the empty set is simply not defined. Attempting to get an answer is a mathematical error. The max of any set must itself be an element in that set. The empty set has no elements, so claiming that some particular number is the max of that set without being in that set is a mathematical contradiction.

Just as it is correct behavior for the computer to throw an exception when the programmer asks it to divide by zero, so it is correct behavior for the computer to throw an exception when the programmer asks it to take the max of the empty set. Division by zero, taking the max of the empty set, wiggering the spacklerorke, and riding the flying unicorn to Neverland are all meaningless, impossible, undefined.

Now, what is it that you actually want to do?

yfeldblum
  • Good point - I'll update my question shortly with those details. Suffice to say that I know I want 0 when there are no records to select from, which definitely has an impact on the eventual solution. – gfrizzle Dec 04 '08 at 23:36
  • 18
    I frequently attempt to fly my unicorn to Neverland, and I take offense to your suggestion that my efforts are meaningless and undefined. – Chris Shouts Nov 18 '11 at 14:07
  • 2
    I don't think this argumentation is right. It's clearly linq-to-sql, and in sql Max over zero rows is defined as null, no? – duedl0r Jan 20 '12 at 17:52
  • 4
    Linq should generally produce identical results whether the query is executed in-memory on objects or whether the query is executed at the database on rows. Linq queries are Linq queries, and should be executed faithfully regardless of which adapter is in use. – yfeldblum Jan 20 '12 at 18:01
  • 1
    While I agree in theory that Linq results should be identical whether executed in memory or in sql, when you actually dig a little deeper, you discover why this cannot always be so. Linq expressions are translated into sql using complex expression translation. It is not a simple one-to-one translation. One difference is the case of null. In C# "null == null" is true. In SQL, "null == null" matches are included for outer joins but not for inner joins. However, inner joins are almost always what you want so they are the default. This causes possible differences in behavior. – Curtis Yallop Dec 31 '14 at 17:13
  • Max({}) returning null could be a reasonable behavior but if the return type of Max is supposed to be a non-nullable type, it cannot return null. It would break C# type-safety. Unless you made Max(List) always return "int?" (nullable int). But that would not always be desirable. – Curtis Yallop Dec 31 '14 at 17:27
  • The max value of an empty set is clearly nothing, correct? What represents "nothing" in numbers? Zero. – Kyle Jul 15 '15 at 15:45
  • @Kyle - That's absolutely false. To put the question in simpler English: "which of these things is biggest?" If there are no things, there is none of them that is biggest, and the question cannot be answered as asked. We don't come up with a medium-sized thing in such cases and point to it and say "this thing is biggest." No. We just say "there are no things so the question cannot be answered." – yfeldblum Jul 16 '15 at 20:24
26

You could always add Double.MinValue to the sequence. This would ensure that there is at least one element and Max would return it only if it is actually the minimum. To determine which option is more efficient (Concat, FirstOrDefault or Take(1)), you should perform adequate benchmarking.

double x = context.MyTable
    .Where(y => y.MyField == value)
    .Select(y => y.MyCounter)
    .Concat(new double[]{Double.MinValue})
    .Max();
David Schmitt
11
int max = list.Any() ? list.Max(i => i.MyCounter) : 0;

If the list has any elements (i.e., it is not empty), this takes the max of the MyCounter field; otherwise it returns 0.

beastieboy
11

Since .NET 3.5 you can use DefaultIfEmpty(), passing the default value as an argument, in one of the following ways:

int max = (from e in context.Table where e.Year == year select e.RecordNumber).DefaultIfEmpty(0).Max();
DateTime maxDate = (from e in context.Table where e.Year == year select e.StartDate ?? DateTime.MinValue).DefaultIfEmpty(DateTime.MinValue).Max();

The first form works when you query a NOT NULL column; the second is how I used it to query a NULLABLE column. If you call DefaultIfEmpty() without arguments, the default value is the default for your output type, as listed in the Default Values Table.

The resulting SELECT will not be so elegant but it's acceptable.

Hope it helps.

7

I think the issue is what you want to happen when the query has no results. If this is an exceptional case, I would wrap the query in a try/catch block and handle the exception that the standard query generates. If it's OK for the query to return no results, you need to decide what the result should be in that case. It may be that @David's answer (or something similar) will work: if the max will always be positive, it may be enough to insert a known "bad" value into the list that will only be selected if there are no results. Generally, I would expect a query that retrieves a maximum to have some data to work on, and I would go the try/catch route; otherwise you are always forced to check whether the value you obtained is correct. I'd rather the non-exceptional case could just use the obtained value.

Try
   Dim x = (From y In context.MyTable _
            Where y.MyField = value _
            Select y.MyCounter).Max
   ... continue working with x ...
Catch ex As InvalidOperationException
   ' Max over zero rows surfaces as InvalidOperationException, not SqlException
   ... do error processing ...
End Try
tvanfosson
  • In my case, returning no rows happens more frequently than not (legacy system, the patient may or may not have previous eligibility, blah blah blah). If this were a more exceptional case, I'd probably go this route though (and I may still, not seeing much better). – gfrizzle Dec 04 '08 at 19:32
6

A little late, but I had the same concern...

Rephrasing your code from the original post, you want the max of the set S defined by

(From y In context.MyTable _
 Where y.MyField = value _
 Select y.MyCounter)

Taking into account your last comment

Suffice to say that I know I want 0 when there are no records to select from, which definitely has an impact on the eventual solution

I can rephrase your problem as: You want the max of {0 + S}. And it looks like the proposed solution with concat is semantically the right one :-)

var max = new[] { 0 }
          .Concat(from y in context.MyTable
                  where y.MyField == value
                  select y.MyCounter)
          .Max();
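
To make the "max of 0 together with S" idea concrete, here is a minimal in-memory sketch. The arrays are hypothetical sample data (one patient with no rows, one with a gap in the counters), and it uses LINQ to Objects rather than a database provider.

```csharp
using System;
using System.Linq;

// Hypothetical data: a patient with no eligibility rows yet...
int[] existing = Array.Empty<int>();
// ...and one whose counters contain a gap left by the legacy app.
int[] withGaps = { 1, 2, 4 };

// Prepending 0 guarantees Max always has at least one element to consider.
int nextForNew = new[] { 0 }.Concat(existing).Max() + 1;
int nextForGaps = new[] { 0 }.Concat(withGaps).Max() + 1;

Console.WriteLine($"{nextForNew} {nextForGaps}");
```

Because the sentinel 0 is lower than any real counter, it only wins when there are no rows, which is why the result differs from a plain row count when gaps exist.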
Dom Ribaut
6

Another possibility would be grouping, similar to how you might approach it in raw SQL:

from y in context.MyTable
group y.MyCounter by y.MyField into GrpByMyField
where GrpByMyField.Key == value
select GrpByMyField.Max()

The only thing is (testing again in LINQPad) switching to the VB LINQ flavor gives syntax errors on the grouping clause. I'm sure the conceptual equivalent is easy enough to find, I just don't know how to reflect it in VB.

The generated SQL would be something along the lines of:

SELECT [t1].[MaxValue]
FROM (
    SELECT MAX([t0].[MyCounter]) AS [MaxValue], [t0].[MyField]
    FROM [MyTable] AS [t0]
    GROUP BY [t0].[MyField]
    ) AS [t1]
WHERE [t1].[MyField] = @p0

The nested SELECT looks icky, like the query execution would retrieve all rows then select the matching one from the retrieved set... the question is whether or not SQL Server optimizes the query into something comparable to applying the where clause to the inner SELECT. I'm looking into that now...

I'm not well-versed in interpreting execution plans in SQL Server, but it looks like when the WHERE clause is on the outer SELECT, the number of actual rows resulting in that step is all rows in the table, versus only the matching rows when the WHERE clause is on the inner SELECT. That said, it looks like only 1% cost is shifted to the following step when all rows are considered, and either way only one row ever comes back from the SQL Server so maybe it's not that big of a difference in the grand scheme of things.

Rex Miller
3

Why not something more direct like:

Dim x = context.MyTable.Max(Function(DataItem) DataItem.MyField = Value)
legal
2

I've knocked up a MaxOrDefault extension method. There's not much to it but its presence in Intellisense is a useful reminder that Max on an empty sequence will cause an exception. Additionally, the method allows the default to be specified if required.

    public static TResult MaxOrDefault<TSource, TResult>(
        this IQueryable<TSource> source,
        Expression<Func<TSource, TResult?>> selector,
        TResult defaultValue = default(TResult)) where TResult : struct
    {
        return source.Max(selector) ?? defaultValue;
    }
Stephen Kennedy
1

Just to let everyone using LINQ to Entities know: the methods above will not work...

If you try to do something like

var max = new[] { 0 }
      .Concat(from y in context.MyTable
              where y.MyField == value
              select y.MyCounter)
      .Max();

It will throw an exception:

System.NotSupportedException: The LINQ expression node type 'NewArrayInit' is not supported in LINQ to Entities..

I would suggest just doing

var max = (from y in context.MyTable
           where y.MyField == value
           select y.MyCounter)
          .OrderByDescending(x => x)
          .FirstOrDefault();

And the FirstOrDefault will return 0 if your list is empty.

Ahmad Mageed
Nix
  • Ordering can result in a serious performance degradation with big datasets. It is a very inefficient way to find a max value. – Peter Bruins Nov 01 '18 at 16:39
1

One interesting difference that seems worth noting is that while FirstOrDefault and Take(1) generate the same SQL (according to LINQPad, anyway), FirstOrDefault returns a value--the default--when there are no matching rows and Take(1) returns no results... at least in LINQPad.
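
The same distinction shows up in LINQ to Objects, sketched here with a hypothetical empty array (this illustrates the return shapes only, not the generated SQL):

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for a query that matches no rows.
int[] none = Array.Empty<int>();

// FirstOrDefault collapses to a single value: default(int) when empty.
int first = none.OrderByDescending(x => x).FirstOrDefault();

// Take(1) stays a sequence, which is simply empty here.
int takenCount = none.OrderByDescending(x => x).Take(1).Count();

Console.WriteLine($"first={first}, takenCount={takenCount}");
```
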

Rex Miller
1

For Entity Framework and Linq to SQL we can achieve this by defining an extension method which modifies an Expression passed to IQueryable<T>.Max(...) method:

static class Extensions
{
    public static TResult MaxOrDefault<T, TResult>(this IQueryable<T> source, 
                                                   Expression<Func<T, TResult>> selector)
        where TResult : struct
    {
        UnaryExpression castedBody = Expression.Convert(selector.Body, typeof(TResult?));
        Expression<Func<T, TResult?>> lambda = Expression.Lambda<Func<T,TResult?>>(castedBody, selector.Parameters);
        return source.Max(lambda) ?? default(TResult);
    }
}

Usage:

int maxId = dbContextInstance.Employees.MaxOrDefault(employee => employee.Id);
// maxId is equal to 0 if there are no records in the Employees table

The generated query is identical: it works just like a normal call to the IQueryable<T>.Max(...) method, but if there are no records it returns the default value of type TResult instead of throwing an exception.

1
decimal Max = context.MyTable.Select(e => (decimal?)e.MyCounter).Max() ?? 0;
Toon Krijthe
jong su.
-1

I just had a similar problem, my unit tests passed using Max() but failed when run against a live database.

My solution was to separate the query from the logic being performed, not join them in one query.
I needed a solution to work in unit tests using Linq-objects (in Linq-objects Max() works with nulls) and Linq-sql when executing in a live environment.

(I mock the Select() in my tests)

var requiredDataQuery = _dataRepo.Select(x => new { x.NullableDate1, x.NullableDate2 });
var dates = requiredDataQuery.ToList();
var maxDate1 = dates.Max(x => x.NullableDate1);
var maxDate2 = dates.Max(x => x.NullableDate2);

Less efficient? Probably.

Do I care, as long as my app doesn't fall over next time? Nope.

Seb