0

I'm trying to select data in a table for companies and dates that don't exist for a different type/id of data.

Put another way, I want company_id, dates_id, daily_val where wh_calc_id = 344 if the same company_id/dates_id combination doesn't exist where wh_calc_id = 368.

I'm loosely following this example: Select rows which are not present in other table

These are my two attempts at it:

attempt 1:

SELECT distinct on (company_id, dates_id) company_id, dates_id, daily_val
FROM   daily_data d1
WHERE  NOT EXISTS (
                       SELECT 1              
                       FROM   daily_data d2
                       WHERE  d1.company_id = d2.company_id
                              and d1.dates_id = d2.dates_id
                              and d1.wh_calc_id = 368
                              and d2.wh_calc_id = 368
                   )
    and d1.wh_calc_id = 344

The problem: It's super slow: 27 minutes

attempt 2: [removed]

All in one (giant) table: company_id int (indexed), dates_id int (indexed), wh_calc_id int (indexed), daily_val numeric

I'm open to adding an index that would help speed things up, but what index?

Postgres 10

PS - I've had to kill both queries before they completed, so I don't really know if they are written correctly. Hopefully my description helps.

mountainclimber11
  • 1,339
  • 1
  • 28
  • 51
  • 1
    I would do it whit a left join this way: SELECT distinct on (company_id, dates_id) company_id, dates_id, daily_val FROM daily_data d1 LEFT JOIN daily_data d2 ON d1.company_id = d2.company_id and d1.dates_id = d2.dates_id and d1.wh_calc_id = 368 and d2.wh_calc_id = 368 WHERE d1.wh_calc_id = 344 AND d2.company_id IS NULL; and create the index over the columns to use: Create index on table daily_data ( company_id, dates_id, wh_calc_id); – Abdel P. Nov 16 '17 at 15:49

2 Answers2

0

I would do it with a left join this way:

SELECT distinct on (company_id, dates_id) company_id, dates_id, daily_val FROM daily_data d1 LEFT JOIN daily_data d2 ON d1.company_id = d2.company_id and d1.dates_id = d2.dates_id and d1.wh_calc_id = 368 and d2.wh_calc_id = 368 WHERE d1.wh_calc_id = 344 AND d2.company_id IS NULL;

and create the index over the columns to use:

Create index on table daily_data ( company_id, dates_id, wh_calc_id);

Abdel P.
  • 739
  • 9
  • 24
0

This does what I want I think:

   SELECT 
    d1.* 
from 
    daily_data d1
LEFT JOIN
    daily_data d2
ON
    d1.company_id     = d2.company_id
    AND d1.dates_id   = d2.dates_id
    AND d2.wh_calc_id = 368
    AND d1.wh_calc_id = 344

where
   and  d1.wh_calc_id = 344
   and d2.wh_calc_id is null
mountainclimber11
  • 1,339
  • 1
  • 28
  • 51