16

Anybody know how to write a LINQ to SQL statement to return every nth row from a table? I'm needing to get the title of the item at the top of each page in a paged data grid back for fast user scanning. So if i wanted the first record, then every 3rd one after that, from the following names:

Amy, Eric, Jason, Joe, John, Josh, Maribel, Paul, Steve, Tom

I'd get Amy, Joe, Maribel, and Tom.

I suspect this can be done... LINQ to SQL statements already invoke the ROW_NUMBER() SQL function in conjunction with sorting and paging. I just don't know how to get back every nth item. The SQL Statement would be something like WHERE ROW_NUMBER MOD 3 = 0, but I don't know the LINQ statement to use to get the right SQL.

p.campbell
  • 98,673
  • 67
  • 256
  • 322
bopapa_1979
  • 8,949
  • 10
  • 51
  • 76
  • I don't know if this could be done with generated LINQ to SQL sql – hunter Nov 17 '10 at 19:50
  • There really (No, REALLY) doesn't seem to be a way to arrive at the accepted answer's basic solution using pure LINQ to SQL. I have used a couple of alternatives. 1) Create a view that materializes the original table plus the row number using the desired sort, then use LINQ to SQL to query that view where the row number is mod zero for n in "take every n rows." 2) Use the accepted answer and ExecuteQuery. I prefer the first solution if I know the pre-mod sort should always be the same. – bopapa_1979 Nov 13 '13 at 18:37

6 Answers6

9

Sometimes, TSQL is the way to go. I would use ExecuteQuery<T> here:

    var data = db.ExecuteQuery<SomeObjectType>(@"
SELECT * FROM 
(SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS [__row]
FROM [YourTable]) x WHERE (x.__row % 25) = 1");

You could also swap out the n:

    var data = db.ExecuteQuery<SomeObjectType>(@"
DECLARE @n int = 2
SELECT * FROM 
(SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS [__row]
FROM [YourTable]) x WHERE (x.__row % @n) = 1", n);
Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
4

Once upon a time, there was no such thing as Row_Number, and yet such queries were possible. Behold!

var query =
  from c in db.Customers
  let i = (
    from c2 in db.Customers
    where c2.ID < c.ID
    select c2).Count()
  where i%3 == 0
  select c;

This generates the following Sql

SELECT [t2].[ID], [t2]. --(more fields)
FROM (
    SELECT [t0].[ID], [t0]. --(more fields)
(
        SELECT COUNT(*)
        FROM [dbo].[Customer] AS [t1]
        WHERE [t1].[ID] < [t0].[ID]
        ) AS [value]
    FROM [dbo].[Customer] AS [t0]
    ) AS [t2]
WHERE ([t2].[value] % @p0) = @p1
Amy B
  • 108,202
  • 21
  • 135
  • 185
  • 4
    -1 This is an impressively slow query! No matter if the table has 10 records or 10000 records, pulling it into memory is faster than this query. – diceguyd30 Nov 18 '10 at 15:32
  • This query also doesn't return the first element in the table. – diceguyd30 Nov 18 '10 at 15:35
  • Yeah, it's pretty slow. See my edit removing the equality from the comparison to get the first element. – Amy B Nov 18 '10 at 15:37
  • This query also assumes that the results are intended to be ordered by ID, which is probably not a valid assumption. – StriplingWarrior Nov 18 '10 at 16:52
  • @StriplingWarrior, when I wrote it doesn't compile he didn't use `let i`. – Saeed Amiri Nov 18 '10 at 19:42
  • @Saeed: the revision history indicates otherwise: http://stackoverflow.com/posts/4216061/revisions – StriplingWarrior Nov 18 '10 at 21:55
  • @StriplingWarrior, yes may be I didn't see that. – Saeed Amiri Nov 19 '10 at 09:15
  • OP here. Since somebody added a bounty, I'm back poking at this. This answer runs very fast on a table with about 1500 records. But when I run the same query against a table with 300,000, the select COUNT really bogs down, even with the optimization in the edit from @DavidB. For large tables, this doesn't seem to be a workable solution at all. – bopapa_1979 Nov 13 '13 at 17:48
  • @Eric yes, this answer is O(n^2). I only added it as a technical curiousity since none of the other answers were single roundtrip linqToSql queries. If I were faced with this problem in production, I would x) read the table into memory and filter there, y) modify the table structure so that "every xth row" is a sargable and indexable criteria, z) use rownumber and not use linqToSql. Those options are not ordered by preference. – Amy B Nov 13 '13 at 21:36
  • @DavidB - It is still fun to toy around with. I got it to work on a million row table yesterday with a few indexes and moving the data files to an FusionIO card. It still took a few minutes to return, and used lots of memory, but it DID work. Anyway, thanks for posting it. It was an interesting option. The other options you mentioned in your recent comment are, of course, much more workable in the wild. Years ago when I published this, we went with option Z. – bopapa_1979 Nov 14 '13 at 20:06
3

Here's an option that works, but it might be worth checking that it doesn't have any performance issues in practice:

var nth = 3;
var ids = Table
            .Select(x => x.Id)
            .ToArray()
            .Where((x, n) => n % nth == 0)
            .ToArray();

var nthRecords = Table
                   .Where(x => ids.Contains(x.Id));
Enigmativity
  • 113,464
  • 11
  • 89
  • 172
  • 2
    This will have problems when the table exceeds 7000 rows, as sql server's parameter limit will be reached. – Amy B Nov 18 '10 at 15:29
1

There really doesn't seem to be an easy way to do this:

How do I add ROW_NUMBER to a LINQ query or Entity?

How to find the ROW_NUMBER() of a row with Linq to SQL

But there's always:

peopleToFilter.AsEnumerable().Where((x,i) => i % AmountToSkipBy == 0)

NOTE: This still doesn't execute on the database side of things!

Community
  • 1
  • 1
diceguyd30
  • 2,742
  • 20
  • 18
  • Unfortunately that overload of `Where` is not supported in LINQ-to-SQL – Stephan Nov 17 '10 at 19:51
  • If you are executing in application instead of in database, you may as well go back to your original code, just convert to Enumerable. `peopleYouWantToFilter.AsEnumerable().Where((x,i) => i % NumberYouWantToSkip == 0)` – Chris Shaffer Nov 17 '10 at 20:22
  • I was actually just getting ready to add that, but thank you! – diceguyd30 Nov 17 '10 at 20:25
  • I was checking that solution and found it to be faster than the extension function so I just replaced it instead. – diceguyd30 Nov 17 '10 at 20:27
  • 1
    +1 in spirit... I can't in good conscience recommend the solution because it is still in memory, but out of the in memory solutions, this at least is the most concise way to do it :) – Chris Shaffer Nov 17 '10 at 20:31
  • -1, answering a LinqToSql question with a Linq answer is like trying to play frisbee with a yoyo. – Amy B Nov 18 '10 at 14:56
  • 1
    Unless the yoyo executes faster. My linq query runs at 3.076 seconds on a table with 5000 records whereas your linq to sql query runs at 7.417 seconds. – diceguyd30 Nov 18 '10 at 15:16
1

Just googling around a bit I haven't found (or experienced) an option for Linq to SQL to directly support this.

The only option I can offer is that you write a stored procedure with the appropriate SQL query written out and then calling the sproc via Linq to SQL. Not the best solution, especially if you have any kind of complex filtering going on.

Chris Shaffer
  • 32,199
  • 5
  • 49
  • 61
  • Sometimes, as it seems in this case, the only answer is that there is no answer. Without a way to directly user `ROW_NUMBER()` from SQL this type of selection is likely impossible. – Stephan Nov 17 '10 at 20:15
  • I think there is a way like what I wrote, just should think about it in another way. – Saeed Amiri Nov 17 '10 at 21:56
1

This will do the trick, but it isn't the most efficient query in the world:

var count = query.Count();
var pageSize = 10;
var pageTops = query.Take(1);
for(int i = pageSize; i < count; i += pageSize)
{
    pageTops = pageTops.Concat(query.Skip(i - (i % pageSize)).Take(1));
}
return pageTops;

It dynamically constructs a query to pull the (nth, 2*nth, 3*nth, etc) value from the given query. If you use this technique, you'll probably want to create a limit of maybe ten or twenty names, similar to how Google results page (1-10, and Next), in order to avoid getting an expression so large the database refuses to attempt to parse it.

If you need better performance, you'll probably have to use a stored procedure or a view to represent your query, and include the row number as part of the stored proc results or the view's fields.

StriplingWarrior
  • 151,543
  • 27
  • 246
  • 315