I've been using PostgreSQL almost daily for over 11 years now, and today I wrote what I thought was a very simple query with a LEFT JOIN that didn't behave the way I expected. I'm lucky I caught the bug, but it has me concerned that there is something fundamental here that I am missing. The following setup reproduces the issue.
CREATE TEMP TABLE tbl_a(date date);
INSERT INTO tbl_a VALUES ('2022-01-01'), ('2022-01-02'), ('2022-01-03'), ('2022-01-04');
CREATE TEMP TABLE sale(date date, item_id int);
INSERT INTO sale VALUES ('2022-01-02', 2), ('2022-01-03', 2), ('2022-01-04', 3);
When I run the following query, I get the results I expect from a LEFT JOIN:
SELECT t.*, s.item_id FROM tbl_a AS t LEFT JOIN sale AS s ON t.date = s.date;
+------------+---------+
| date       | item_id |
+------------+---------+
| 2022-01-01 | NULL    |
| 2022-01-02 | 2       |
| 2022-01-03 | 2       |
| 2022-01-04 | 3       |
+------------+---------+
I get every record in tbl_a, and since there are no sale records for 2022-01-01, item_id is NULL for that date.
However, when I add a WHERE clause to the query, I get an unexpected result:
SELECT t.*, s.item_id FROM tbl_a AS t LEFT JOIN sale AS s ON t.date = s.date WHERE s.item_id = 2;
+------------+---------+
| date       | item_id |
+------------+---------+
| 2022-01-02 | 2       |
| 2022-01-03 | 2       |
+------------+---------+
Note: the rows for 2022-01-01 and 2022-01-04 are missing. I expected them to appear with a NULL item_id.
However, if I rewrite the query with a CTE, I get the results I expect:
WITH s AS (SELECT * FROM sale WHERE item_id = 2) SELECT t.*, s.item_id FROM tbl_a AS t LEFT JOIN s ON t.date = s.date ORDER BY t.date;
+------------+---------+
| date       | item_id |
+------------+---------+
| 2022-01-01 | NULL    |
| 2022-01-02 | 2       |
| 2022-01-03 | 2       |
| 2022-01-04 | NULL    |
+------------+---------+
My question: why do the LEFT JOIN query with the WHERE clause and the CTE query yield different results?
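One more data point, in case it helps narrow things down: moving the filter out of the WHERE clause and into the ON clause appears to behave like the CTE version rather than the WHERE version. This is only a minimal sketch against the same temp tables defined above, and I have not checked it beyond this sample data:

```sql
-- Same join as before, but the item_id filter is part of the join
-- condition instead of a WHERE clause applied to the joined rows.
SELECT t.*, s.item_id
FROM tbl_a AS t
LEFT JOIN sale AS s
  ON t.date = s.date
 AND s.item_id = 2
ORDER BY t.date;
```

On the sample data this returns all four dates, with a NULL item_id for 2022-01-01 and 2022-01-04, which matches the CTE output.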
Note: this is the PostgreSQL version I am running:
SELECT version();
+-----------------------------------------------------------------------------------------------------------------------------------+
| version |
+-----------------------------------------------------------------------------------------------------------------------------------+
| PostgreSQL 13.7 (Ubuntu 13.7-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit |
+-----------------------------------------------------------------------------------------------------------------------------------+