-1

user

id name age
1 anna 6
2 john 10
3 lord 50

cats

id name userID
1 miez 1
2 caty 1
3 random 2
4 idk 3

When using

SELECT U.id, C.name FROM user U
INNER JOIN cats C ON U.id = C.id
LIMIT 2

I get as a

result

UserID CatName
1 miez
1 caty

What I want is to limit my rows by the distinct values of UserID, like this

SELECT U.id, C.name FROM user U
INNER JOIN cats C ON U.id = C.id
LIMIT 2 <distinct U.id rows>
UserID CatName
1 miez
1 caty
2 random

People suggested using limit in subqueries and check if UserID is in the return like

... WHERE UserID IN (SELECT id FROM User LIMIT 2)

but this only works well for small tables and is not an elegant solution for good performance. My idea was using DENSE_RANK(), like:

SELECT U.id, C.name FROM user U
DENSE_RANK() OVER (ORDER BY U.id) as rows,
INNER JOIN cats C ON U.id = C.id
WHERE rows < 50

but it is not working either.

user16405471
  • 61
  • 1
  • 7

1 Answers1

1

You can't use a column alias on the same level where you define it. You will have to wrap the query in a derived table. However if you want a specific number of row per user you need to use partition by, not order by

select id, name
from (
  SELECT u.id, 
         c.name, 
         DENSE_RANK() OVER (PARTITION BY U.id ORDER BY c.name) as rnk
  FROM user U
    JOIN cats C ON U.id = C.userid
) t
WHERE t.rnk <= 2
  • thx for your help. but that would be a solution for smaller tables. you use a subquery for user but don´t limit the results. that´s why I mentioned "without subqueries" because in realitiy I have pretty large tables and can not just query through user with 100000 entries – user16405471 Nov 04 '22 at 12:06
  • 1
    The derived table is only syntactic sugar to allow a reference to the column alias in the WHERE clause. It has no impact on the performance of evaluating the `dense_rank()`. 100000 rows isn't considered "pretty large" these days. In fact I consider that a small table. –  Nov 04 '22 at 12:17