0

My query goes over three entities, so I have two left joins. I want to restrict the result of the first join. For example we have customers and order evaluations. In the first join I want only the youngest order evaluation that is hanging onto the customer. The older ones don't interest me. After this I join over another table (with dependency to customer and orderevaluation) and ask for certain conditions. I can't seem to find out how I can restrict the first join result with the help of the createstamp.

My query draft looks like this:

SELECT * FROM customer 
left join (SELECT * FROM orderevaluation ORDER BY createstamp desc LIMIT 1) o on customer.id = o.customer_id
left join ... WHERE ... AND ... ;

Here is the problem that there gets only one orderevaluation selected for all my orders and not one per order. I want to select the youngest orderevaluation per customer and then join again.

I also tried:

Selecting after the

customer.id = 

But there I could only work with the id and not with the createstamp.

I tried to use ORDER BY o.createstamp DESC limit 1 after the WHERE condition but it doesn't work, either.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
newbie
  • 3
  • 1
  • Crucial for the optimal query: do you run it for one or a few selected customers, or for most or all of the table at once? – Erwin Brandstetter Dec 20 '22 at 23:28
  • After the two left joins I have like four conditions for the customers that get selected but for the joins the whole customer-table is used, is it not? The conditions are all dependent on the other two entities, like the createstamp for the orderevaluation is not older than x and so one – newbie Dec 21 '22 at 13:38
  • That sounds like a broken query. See: https://stackoverflow.com/a/40477254/939860 – Erwin Brandstetter Dec 21 '22 at 14:42

2 Answers2

2

Typically, you run such a query for one or a few selected customers (filtered in the undisclosed outer WHERE clause). Then, a LATERAL subquery is typically fastest, as it only processes the few customers of interest instead of the whole table.

SELECT ...
FROM   customer c
LEFT   JOIN LATERAL (
   SELECT o.*
   FROM   orderevaluation o
   WHERE  o.customer_id = c.id
   ORDER  BY o.customer_id, o.createstamp DESC NULLS LAST
   LIMIT  1
   ) o ON true
LEFT  JOIN ... 
WHERE ...  -- filtering *few* customers

See:

About the LATERAL join:

Be sure to have an index with leading customer_id, ideally on orderevaluation(customer_id, customer_id, createstamp DESC NULLS LAST).

Drop NULLS LAST from index and query if the column is defined NOT NULL.

For additional joins, consider:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

I want to select the youngest orderevaluation per customer and then join again.

You would typically filter the first table in a subquery. In Postgres, one approach uses distinct on. Starting from your pseudo-code:

SELECT ...  -- enumerate wanted cols here 
FROM customer c
LEFT JOIN (
    SELECT DISTINCT ON (customer_id) o.*
    FROM orderevaluation o
    ORDER BY customer_id, createstamp DESC
) o on customer.id = o.customer_id
LEFR JOIN ... 
WHERE ...

Another approach would be to use lateral joins instead of left joins (and then we could indeed use limit, as in your attempt) - which would require more context about the design of your tables.

GMB
  • 216,147
  • 25
  • 84
  • 135