I'm trying to understand why the following query only returns one row in spite of LEFT JOIN
with t1(day_partition, entity_id, feature_1) AS (values
('2020-05-15', 'id_1', 'x'),
('2020-05-15', 'id_2', 'y')
),
t2(day_partition, entity_id, feature_2) AS (values
('2020-05-15', 'id_1', 1)
)
SELECT
t1.day_partition AS day_partition_1,
t2.day_partition AS day_partition_2,
t1.entity_id AS entity_id_1,
t2.entity_id AS entity_id_2
FROM
t1
LEFT JOIN
t2
ON
t1.entity_id = t2.entity_id
WHERE
t2.day_partition = '2020-05-15'
;
returns
day_partition_1 | day_partition_2 | entity_id_1 | entity_id_2
-----------------+-----------------+-------------+-------------
2020-05-15 | 2020-05-15 | o1 | o1
However, removing the filter of
WHERE
t2.day_partition = '2020-05-15'
will return
day_partition_1 | day_partition_2 | entity_id_1 | entity_id_2
-----------------+-----------------+-------------+-------------
2020-05-15 | 2020-05-15 | id_1 | id_1
2020-05-15 | NULL | id_2 | NULL
I find such behavior unintuitive and what's the rule behind it?