1

I came across some weird behavior today with postgresql.

WITH actual_prices AS (
    -- Looking for prices from now to the given number of days back
    SELECT *
    FROM prices
    WHERE price_date >= now()::date - 93
)
, distinct_products_sold AS (
    SELECT distinct(id_product) as pid FROM products_sold
)
, first_prices AS (
    SELECT s.pid, p.product_id, p.price_date, p.price
    FROM   distinct_products_sold s
    LEFT   JOIN actual_prices p ON p.product_id = s.pid
)
select * from first_prices;

This code outputs something of this kind:

129 | | |
195 | | |
251 | | |
...

In other words, columns of table actual_prices are empty. I tried messing around with JOIN just to see what's going on: if I do RIGHT JOIN instead of LEFT JOIN, it empties the column of distinct_products_sold but the columns of actual_prices are displayed correctly. What can cause this?

khajvah
  • 4,889
  • 9
  • 41
  • 63
  • 1
    Distinct is not a function (on a column), it's a part of SELECT DISTINCT - and works on the whole selected rows. `SELECT distinct(id_product) ...` is the same as `SELECT DISTINCT id_product ...`. – jarlh Aug 18 '16 at 09:22
  • 1
    Why is it weird? There probably just aren't any matching rows in either table where `p.product_id = s.pid`. – GarethD Aug 18 '16 at 09:24
  • The nature of joins is that they "empty" records out. In the case of a `LEFT JOIN`, the records on the left side of the join will all be retained even if nothing matches, and the opposite is true for a `RIGHT JOIN`. – Tim Biegeleisen Aug 18 '16 at 09:25
  • @GarethD It does have an output, so there are matches. But it returns empty columns. – khajvah Aug 18 '16 at 09:25
  • 2
    That would be true for an `INNER JOIN`, but the very nature of a `LEFT/RIGHT/FULL OUTER JOIN` is that there does not have to be a match in both tables for rows to be returned. [This answer on types of `JOIN`](http://stackoverflow.com/a/20298671/1048425) should help you. – GarethD Aug 18 '16 at 09:27
  • 1
    @GarethD Thanks a lot. Weird confusion. You can write a simple answer so I can accept something. – khajvah Aug 18 '16 at 09:34

3 Answers3

2

You have it the wrong way around: it is not that the outer join causes data to be lost from one table, rather it forces a union between the tables by padding the missing columns with nulls e.g.

WITH P ( PID ) AS
(
 SELECT *
   FROM (
         VALUES ( 1 ), ( 2 ), ( 3 )
        ) AS T ( C )
),
Q ( QID ) AS
(
 SELECT *
   FROM (
         VALUES ( 4 ), ( 5 ), ( 6 )
        ) AS T ( C )
)
SELECT p.PID, q.QID
  FROM P p, Q q
 WHERE p.PID = q.QID
UNION
SELECT p.PID, NULL
  FROM P p
 WHERE p.PID NOT IN ( SELECT QID FROM Q );
onedaywhen
  • 55,269
  • 12
  • 100
  • 138
0

Forgive me for my brainfart. Turns out it output unmatched results(how surprising). LEFT/RIGHT Joins also output unmatched results of left or right table.

P.S. Have a launch before posting a question.

khajvah
  • 4,889
  • 9
  • 41
  • 63
-1

No need for WITH clause here , try this:

SELECT t.pid , p.product_id, p.price_date, p.price
FROM (SELECT distinct id_product as pid FROM products_sold) t
LEFT JOIN prices p 
 ON(t.pid = p.product_id AND p.price_date >= now()::date - 93)

If all the columns from table prices are still NULL, then there are just no matches.

A left join keeps all the records from the leading table(the left table) and only the matched data from the right table.

sagi
  • 40,026
  • 6
  • 59
  • 84