0

Assume I have the following table:

CREATE TABLE test_UserAds
(
    UserId int,
    Title varchar(255),
    ActivateDate DATETIME2
);

And this test data:

INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 1, 'Waiter', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 1, 'Policeman', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 1, 'Data Clerk', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 1, 'Nurse', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 2, 'Chef', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 2, 'Teacher', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 2, 'Mechanic', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 2, 'Cleaner', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 2, 'Painter', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 2, 'Bricklayer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 2, 'Plasterer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 2, 'Electrician', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 3, 'Programmer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 3, 'Driver', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 3, 'Photographer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 3, 'Carpenter', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 3, 'CEO of some nasty Corporation', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 3, 'Writer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 3, 'Labourer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 3, 'Stack Overflow Bore', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 4, 'Surgeon', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 4, 'Diver', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 4, 'Chicken sexer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 4, 'Bottle washer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 4, 'Food Tester', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 4, 'Chemist', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 4, 'Biologist', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 4, 'Secretary', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 5, 'Doctor', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 5, 'Developer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 5, 'Designer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 5, 'Tramp', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 5, 'Clock watcher', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 5, 'Pedant', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 5, 'Neuromancer', SYSDATETIME())
INSERT INTO dbo.test_UserAds ( UserId, Title, ActivateDate ) VALUES  ( 5, 'Gardener', SYSDATETIME())

I want to be able to return the top 15 rows ordered by ActivateDate (desc) but no more than 3 for each user. Can anyone suggest something?

The expected result set would be something like the following:

1, Waiter
1, Policeman
1, Data Clerk
2, Chef
2, Teacher
2, Mechanic
3, Programmer
3, Driver,
3, Photographer
4, Surgeon
4, Diver
4, Chicken sexer
5, Doctor
5, Developer
5, Designer
Norbert Norbertson
  • 2,102
  • 1
  • 16
  • 28

1 Answers1

1

In SQL Server, you can use row_Number() to get the three most recent for each user. Then just take the most recent 15 of those:

select top 15 ua.*
from (select ua.*,
             row_number() over (partition by userid order by activatedate desc) as seqnum
      from dbo.test_UserAds ua
     ) ua
where seqnum <= 3  -- get three most recent for each user
order by activatedate desc;

If you have a table of users, you can also do:

select top 15 ua.*
from users u cross apply
     (select top 3 ua.*
      from dbo.test_UserAds ua
      where ua.userid = u.userid
      order by activatedate desc
     ) ua
order by activatedate desc;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • many thanks Gordon. That's excellent. There is a user table so I will also try the second one tomorrow – Norbert Norbertson Dec 05 '16 at 22:49
  • Just wondering if there are any pros and cons of the above two approaches? – Norbert Norbertson Dec 06 '16 at 08:51
  • @NorbertNorbertson . . . `apply` surprises me as being surprisingly good in performance. Under some circumstances, the `apply` version might work much better (with appropriate indexes), particularly if each user has lots of ads -- think hundreds or thousands per user. – Gordon Linoff Dec 06 '16 at 13:01
  • Thanks Gordon. Most users have between 10 and 50 ads but they could have hundreds. I'll see which works best. – Norbert Norbertson Dec 06 '16 at 23:11
  • discovered that in our scenario the `cross apply` method is significantly slower than the `row_number()` method. I am going to review the indexes and do more testing. – Norbert Norbertson Dec 07 '16 at 13:28
  • @NorbertNorbertson . . . Thank you for the update. For performance in either case, you want an index on `test_UserAds(userid, activatedate)`. – Gordon Linoff Dec 07 '16 at 13:30