selecting data in big table for which certain IDs don't exist in same columns. Speed up query

Question

I'm trying to select data in a table for companies and dates that don't exist for a different type/id of data.

Put another way, I want company_id, dates_id, daily_val where wh_calc_id = 344 if the same company_id/dates_id combination doesn't exist where wh_calc_id = 368.

I'm loosely following this example: Select rows which are not present in other table

These are my two attempts at it:

attempt 1:

SELECT distinct on (company_id, dates_id) company_id, dates_id, daily_val
FROM   daily_data d1
WHERE  NOT EXISTS (
                       SELECT 1              
                       FROM   daily_data d2
                       WHERE  d1.company_id = d2.company_id
                              and d1.dates_id = d2.dates_id
                              and d1.wh_calc_id = 368
                              and d2.wh_calc_id = 368
                   )
    and d1.wh_calc_id = 344

The problem: It's super slow: 27 minutes

attempt 2: [removed]

All in one (giant) table: company_id int (indexed), dates_id int (indexed), wh_calc_id int (indexed), daily_val numeric

I'm open to adding an index that would help speed things up, but what index?

Postgres 10

~~PS - I've had to kill both queries before they completed, so I don't really know if they are written correctly. Hopefully my description helps.~~

I would do it whit a left join this way: SELECT distinct on (company_id, dates_id) company_id, dates_id, daily_val FROM daily_data d1 LEFT JOIN daily_data d2 ON d1.company_id = d2.company_id and d1.dates_id = d2.dates_id and d1.wh_calc_id = 368 and d2.wh_calc_id = 368 WHERE d1.wh_calc_id = 344 AND d2.company_id IS NULL; and create the index over the columns to use: Create index on table daily_data ( company_id, dates_id, wh_calc_id); — Abdel P., Nov 16 '17 at 15:49

Abdel P. · Answer 1 · 2017-11-17T10:35:44.350

0

I would do it with a left join this way:

SELECT distinct on (company_id, dates_id) company_id, dates_id, daily_val FROM daily_data d1 LEFT JOIN daily_data d2 ON d1.company_id = d2.company_id and d1.dates_id = d2.dates_id and d1.wh_calc_id = 368 and d2.wh_calc_id = 368 WHERE d1.wh_calc_id = 344 AND d2.company_id IS NULL;

and create the index over the columns to use:

Create index on table daily_data ( company_id, dates_id, wh_calc_id);

edited Nov 17 '17 at 10:35

answered Nov 16 '17 at 22:48

Abdel P.

739
9
24

turns out that didn't do what I wanted. – mountainclimber11 Nov 20 '17 at 21:02

mountainclimber11 · Accepted Answer · 2017-11-20T21:55:48.800

0

This does what I want I think:

   SELECT 
    d1.* 
from 
    daily_data d1
LEFT JOIN
    daily_data d2
ON
    d1.company_id     = d2.company_id
    AND d1.dates_id   = d2.dates_id
    AND d2.wh_calc_id = 368
    AND d1.wh_calc_id = 344

where
   and  d1.wh_calc_id = 344
   and d2.wh_calc_id is null

edited Nov 20 '17 at 21:55

answered Nov 20 '17 at 21:18

mountainclimber11

1,339
1
28
51

selecting data in big table for which certain IDs don't exist in same columns. Speed up query

2 Answers2