sql server: indexed view containing greatest row per group

Question

I'm trying to make an indexed view that computes the greatest n per group for a table in my database. I've tried three different queries for the view though, and they all seem to be not allowed by sql server. The table is just a list of files, each has an id, a userid and a date, and i'm trying to find the files for each user that are the newest.

I tried the 2 suggestions here, but they gave me errors saying I wasn't allowed to do group by or outer joins in an indexed view. I'll copy my implementations here so you can see.

select f1.id, f1.userid, f1.filedate
from files f1
inner join(
   select userid, max(filedate) as maxDate
   from files
   group by userid
) as f2
on f1.userid = f2.userid and f1.maxdate = f2.maxdate

or

select f1.id, f1.userid, f1.filedate
from files f1
left outer join files f2
on (f1.userid = f2.userid and f1.filedate < f2.filedate)
where f2.id is null;

Both of those queries worked, but creating the index on the view failed because it said group bys and outer joins were not allowed.

I also tried

SELECT f1.id, f1.userid, f1.filedate FROM [dbo].[DiagnosticFiles] as f1
WHERE NOT EXISTS (
   SELECT f1.id FROM [dbo].[DiagnosticFiles] as f2 INNER JOIN [dbo].[DiagnosticFiles] as f3 ON f2.userid = f3.userid
WHERE f1.id = f2.id AND f2.FileDate < f3.FileDate)

and it said subqueries were not allowed when I tried to add it to the index

I have an idea of how to do this with only inner joins, but the problem is it seems to require n-1 joins, where n is the maximum number of files a single userid has. here is the basic query

SELECT DISTINCT f1.id, f1.userid, f1.filedate
FROM FILES f1
INNER JOIN files f2
ON (f1.userid = f2.userid and f1.filedate > f2.filedate)

This only works if there are at most 2 rows for any given userid. See Here. If I added this to the end of the query

INNER JOIN files f3
ON (f2.userid = f3.userid and f2.filedate > f3.filedate)

then it would work up to 3 files per user, but you see the problem.

Is there a reasonable way to make an indexed view for this kind of query?

Thanks.

EDIT: In addition, if there is no way to do the indexed view, and i left it as an unindexed view that basically did the query every time, would that be very slow for huge numbers of rows, and probably something in the hundreds of rows per user? I was planning on making some nonclustered indexes on userid and filedate

score 2 · Accepted Answer · answered Apr 18 '13 at 21:47

2

No, thats not possible. It would be so great if it was. I suggest you upvote all indexed view tickets on http://connect.microsoft.com that you can find (some are mine) to get some priority on that feature.

Some of your ideas also fail on the fact that self-joins are not allowed, either. It really is faster to list what is allowed that what's not.

At best you can materialize a part of your query:

   select userid, max(filedate) as maxDate
   from files
   group by userid

And add a useful index on files for joining back to it. This will probably give you a very cheap query overall.

answered Apr 18 '13 at 21:47

usr

168,620
35
240
369

would that be better than just leaving one of those queries as an unindexed view and properly indexing the original table? – bdwain Apr 18 '13 at 21:50
Yes because an unindexed view does not help at all. Look at the query plans and compare them to determine the impact. – usr Apr 18 '13 at 22:00

sql server: indexed view containing greatest row per group

1 Answers1