0

I have a table like so:

create table cars (
    id serial primary key,
    au_rating integer,
    year integer
);

insert into cars (au_rating, year) VALUES 
  (1,2019),
  (1,2020),
  (1,2016),
  (2,2016),
  (2,2019),
  (3,2020),
  (3,2018),
  (3,2017),
  (3,2019),
  (3,2020),
  (0,2020),
  (0,2006)
;

I want to select N first elements in each group using a JOIN LATERAL (similar to this: https://stackoverflow.com/a/37862028/1002814):

SELECT cars_outer.id, cars_top.au_rating, cars_top.year
FROM cars cars_outer
JOIN LATERAL (
    SELECT * FROM cars cars_inner
    WHERE cars_inner.au_rating = cars_outer.au_rating
    ORDER BY ABS(cars_inner.year - 2019)
    LIMIT 2
) cars_top on true
ORDER BY cars_outer.au_rating DESC

However as I only want a max of 2 rows in each group of au_rating, I supplied a LIMIT 2 to the lateral join, however it seems to be ignored - why?

http://sqlfiddle.com/#!17/c0704/5

GMB
  • 216,147
  • 25
  • 84
  • 135
Niels Kristian
  • 8,661
  • 11
  • 59
  • 117

3 Answers3

2

If you want the top-2 records per rating group, then I would recommend filtering with window functions.

The logic you seem to want is:

select (c.c).*
from (
    select c, row_number() over(partition by au_rating order by year) rn
    from cars c
) c
where rn <= 2

I am unsure about how you define the "top 2" records. This sorts records by year, so you get the two oldest records per au_rating. You can change the order by clause to your actual requirement.

Demo on DB Fiddle:

id | au_rating | year
-: | --------: | ---:
12 |         0 | 2006
11 |         0 | 2020
 3 |         1 | 2016
 1 |         1 | 2019
 4 |         2 | 2016
 5 |         2 | 2019
 8 |         3 | 2017
 7 |         3 | 2018
GMB
  • 216,147
  • 25
  • 84
  • 135
1

GMB's answer with the window functions that you accepted is best.

The LIMIT was being honored in your LATERAL JOIN query. The problem was that records repeat and you omitted DISTINCT.

Update

I failed to mention that you want cars_top.id instead of cars_outer.id, and the DISTINCT means the order_by column has to be in the result.

SELECT DISTINCT cars_top.id, cars_top.au_rating, cars_top.year
    FROM cars cars_outer
    JOIN LATERAL (
        SELECT * FROM cars cars_inner
        WHERE cars_inner.au_rating = cars_outer.au_rating
        ORDER BY ABS(cars_inner.year - 2019)
        LIMIT 2
    ) cars_top on true
  ORDER BY cars_top.au_rating DESC
Mike Organek
  • 11,647
  • 3
  • 11
  • 26
1

You query is doing exactly what you are requesting. For each car in you table, it is returning to cars that match.

You can do what you want by having the first "table" identify only the ratings you want:

SELECT cars_top.au_rating, cars_top.year
FROM (SELECT DISTINCT au_rating FROM cars
     ) cars_outer CROSS JOIN LATERAL
     (SELECT *
      FROM cars cars_inner
      WHERE cars_inner.au_rating = cars_outer.au_rating
      ORDER BY ABS(cars_inner.year - 2019)
      LIMIT 2
     ) cars_top 
ORDER BY cars_outer.au_rating DESC;

Note that if you use CROSS JOIN LATERAL, you don't need an ON clause.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786