I want to SELECT
only the data where the create_dt
or last_dt
of each table is the latest.
Then you certainly cannot join on idx
since the latest per table won't share the same idx
value.
Your query does not do what you say. This one does:
SELECT *
FROM (
(
SELECT GREATEST(create_dt, last_dt) AS latest_a, *
FROM a
WHERE create_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY create_dt DESC
LIMIT 1
)
UNION ALL
(
SELECT GREATEST(create_dt, last_dt) AS latest_a, *
FROM a
WHERE last_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY last_dt DESC
LIMIT 1
)
ORDER BY latest_a
LIMIT 1
) a
CROSS JOIN (
(
SELECT GREATEST(create_dt, last_dt) AS latest_b, *
FROM b
WHERE create_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY create_dt DESC
LIMIT 1
)
UNION ALL
(
SELECT GREATEST(create_dt, last_dt) AS latest_b, *
FROM b
WHERE last_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY last_dt DESC
LIMIT 1
)
ORDER BY latest_b
LIMIT 1
) b
CROSS JOIN (
(
SELECT GREATEST(create_dt, last_dt) AS latest_c, *
FROM a
WHERE create_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY create_dt DESC
LIMIT 1
)
UNION ALL
(
SELECT GREATEST(create_dt, last_dt) AS latest_c, *
FROM a
WHERE last_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY last_dt DESC
LIMIT 1
)
ORDER BY latest_c
LIMIT 1
) c -- USING (idx);
All parentheses required.
A bit verbose. But it's as fast as this gets - provided you have these indexes:
CREATE INDEX a_create_dt_idx ON A (create_dt);
CREATE INDEX a_last_dt_idx ON A (last_dt);
CREATE INDEX b_create_dt_idx ON B (create_dt);
CREATE INDEX b_last_dt_idx ON B (last_dt);
CREATE INDEX c_create_dt_idx ON C (create_dt);
CREATE INDEX c_last_dt_idx ON C (last_dt);
It will be two index seeks per table, directly picking the one qualifying row every time.
I am joining with an unconditional CROSS JOIN
, since each subquery returns exactly one row, provided at least one qualifies.
If one of the subqueries finds no row, the result is empty. Maybe you really want a FULL OUTER JOIN
to preserve results from the other tables if one comes up empty. Or just 3 result rows.
Then again, I wouldn't be surprised if you didn't exactly say what you really need. My educated guess: you want something like this (the latest row per idx from each table):
(
SELECT DISTINCT ON (idx) 'a_latest_create_dt' AS what, *
FROM a
WHERE create_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY idx DESC, create_dt DESC, last_dt DESC
)
UNION ALL
(
SELECT DISTINCT ON (idx) 'a_latest_last_dt' AS what, *
FROM a
WHERE last_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY idx DESC, last_dt DESC, create_dt DESC
)
UNION ALL
(
SELECT DISTINCT ON (idx) 'b_latest_create_dt' AS what, *
FROM b
WHERE create_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY idx DESC, create_dt DESC, last_dt DESC
)
UNION ALL
(
SELECT DISTINCT ON (idx) 'b_latest_last_dt' AS what, *
FROM b
WHERE last_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY idx DESC, last_dt DESC, create_dt DESC
)
UNION ALL
(
SELECT DISTINCT ON (idx) 'c_latest_create_dt' AS what, *
FROM c
WHERE create_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY idx DESC, create_dt DESC, last_dt DESC
)
UNION ALL
(
SELECT DISTINCT ON (idx) 'c_latest_last_dt' AS what, *
FROM c
WHERE last_dt BETWEEN '2023-05-01' AND '2023-05-31'
ORDER BY idx DESC, last_dt DESC, create_dt DESC
)
ORDER BY idx, what;
This time I list the latest rows per idx for each table: one for create_dt
, one for last_dt
. Makes 6 rows if there is one in the time frame for every flavor.
About DISTINCT ON
:
Assuming all involved columns are NOT NULL
. Else you may have to do more ...
Matching indexes:
CREATE INDEX a_create_dt_idx ON A (idx, create_dt);
CREATE INDEX a_last_dt_idx ON A (idx, last_dt);
CREATE INDEX b_create_dt_idx ON B (idx, create_dt);
CREATE INDEX b_last_dt_idx ON B (idx, last_dt);
CREATE INDEX c_create_dt_idx ON C (idx, create_dt);
CREATE INDEX c_last_dt_idx ON C (idx, last_dt);