0

I have a two large tables.

In the first table have indexed id field with type int. In the second table have indexed classId field with type varchar(50). I need to get all fields from first table which do not have classId in the second.

select 
    a.id, a.type, a.path, a.filename
from 
    assets a
where 
    a.type="folder" and 
    concat_ws("", a.path, a.filename) like "/product%" AND
    a.id not in (
        select 
           classId 
        from 
           gridconfigs g 
        where 
           g.type='asset' and
           g.name='Photo Attributes'
    );

I am trying to do it in that way but request is executing very slowly.

Any idea?

Bohdan
  • 33
  • 7
  • 1
    Use Not exists... `SELECT * FROM assets A WHERE NOT EXISTS (SELECT 1 FROM gridconfigs G WHERE A.ID = G.CLASSID and G.type = 'asset' and G.name = 'Photo Attributes')` Implicit casting should take care of the variance. not in doesn't handle nulls well. – xQbert Dec 09 '22 at 19:11
  • Wow! It is works great! Can you explain please why it works match faster than my variant. – Bohdan Dec 09 '22 at 19:25
  • if you look at an execution plan you'll see an in must process every record to see if a.id = g.classID. `EXISTS` can early exit once a single instance is found. If you needed data out of gridconfigs we would use a left join where g.classID is null. instead of a not exists; but at a bit slower of a run. Also in does not process NULLS correctly whereas not exists will. https://stackoverflow.com/questions/12500646/which-is-faster-not-in-or-not-exists – xQbert Dec 09 '22 at 19:49
  • and read https://stackoverflow.com/questions/173041/not-in-vs-not-exists – xQbert Dec 09 '22 at 19:50
  • 1
    note: implicit casting is occuring to allow these joins to occur. If you want /need even better performance insure they are the same data types and both have indexes. – xQbert Dec 09 '22 at 20:03
  • Please provide `SHOW CREATE TABLE` so we can see the datatypes and indexes. – Rick James Dec 10 '22 at 07:19
  • I find perfect is the enemy of done. – xQbert Dec 10 '22 at 21:24

1 Answers1

1

Use Not exists... or if you need data from gridconfigs use a left join where g.classID is null

SELECT * 
FROM assets A 
WHERE a.type="folder" 
  AND  concat_ws("", a.path, a.filename) like "/product%" 
  AND NOT EXISTS 
      (SELECT 1 
       FROM gridconfigs G 
       WHERE A.ID = G.CLASSID 
         and G.type = 'asset' 
         and G.name = 'Photo Attributes')

As to why: Not exist and not in operate differently in the presence of NULL

Not exists can early exit once a single occurrence is found, in doesn't.

Read up:

Generally speaking: not in will not produce the desired results when Null is present; whereas not exists produces the desired results. They behave differently.

Also the concat_ws() means an index can't be used and will be slower. so you may be able to optomize this as well. But if it's good enough don't bother.

xQbert
  • 34,733
  • 2
  • 41
  • 62
  • Changing `G.CLASSID` to `0+G.CLASSID` _might_ speed it up further. (I don't know for sure.) Also `INDEX(classic, type, name)` (in any order) may help. – Rick James Dec 10 '22 at 07:23