0

This is similar to How to find missing data rows using SQL? and How to find missing rows (dates) in a mysql table? but a bit more complex, so I'm hitting a wall.

I have a data table with the noted Primary key:

country_id (PK)
product_id (PK)
history_date (PK)
amount

I have a products table with all products, a countries table, and a calendar table with all valid dates.

I'd like to find all countries, dates and products for which there are missing products, with this wrinkle: I only care about dates for which there are entries for a country for at least one product (i.e. if the country has NOTHING on that day, I don't need to find it) - so, by definition, there is an entry in the history table for every country and date I care about.

I know it's going to involve some joins maybe a cross join, but I'm hitting a real wall in finding missing data.

I tried this (pretty sure it wouldn't work):

SELECT h.history_date, h.product_id, h.country_id, h.amount
FROM products p
LEFT JOIN history h ON (p.product_id = h.product_id)
WHERE h.product_id IS NULL

No Joy.

I tried this too:

WITH allData AS (SELECT h1.country_id, p.product_id, h1.history_date
FROM products p 
CROSS JOIN (SELECT DISTINCT country_id, history_date FROM history) h1)
SELECT f.history_date, f.product_id, f.country_id
FROM allData f
LEFT OUTER JOIN history h ON (f.country_id = h.country_id AND f.history_date = h.history_date AND f.product_id = h.product_id)
WHERE h.product_id IS NULL 
   AND h.country_id IS NOT NULL 
   AND h.history_date IS NOT null

also no luck. The CTE does get me every product on every date that there is also data, but the rest returns nothing.

Marc
  • 3,259
  • 4
  • 30
  • 41

1 Answers1

1

I only care about dates for which there are entries for a country for at least one product (i.e. if the country has NOTHING on that day, I don't need to find it)

So we care about this combination:

from    (select distinct country_id, history_date from history) country_date
cross join products p

Then it's just a matter of checking for existence:

select *
from    (select distinct country_id, history_date from history) country_date
cross join products p
where not exists (select null
                  from history h
                  where country_date.country_id   = h.country_id
                   and  country_date.history_date = h.history_date
                   and  p.product_id              = h.product_id
                 ) 
Andrew Sayer
  • 2,296
  • 1
  • 7
  • 9
  • So I was close, but no cigar. This is great, thanks! – Marc Oct 28 '20 at 21:38
  • To correct your version, you just need to look at the filters you are using against your outer joined table. `AND h.country_id IS NOT NULL ` is only going to be true if there is a row matched on all of the outer join conditions. – Andrew Sayer Oct 28 '20 at 21:43