Poor performance in tables join

Question

I'm experiencing very poor performance in a Oracle SQL query. The query is this:

 SELECT distinct idm3.cod
    FROM ibe_partos_m idm3, ibe_padron_prov_2010_m pad2
   WHERE     idm3.codLote = 1
         AND idm3.activo = 1
         AND ((pad2.ar = '2016' and pad2.mes='1') or (pad2.ar = '2015' and pad2.mes='7'))
         AND idm3.cod NOT IN                
           (SELECT idm2.cod
              FROM ibe_partos_m idm2,
                   ibe_padron_prov_2010_m pad
             WHERE     idm2.codLote = 1
                   AND idm2.activo = 1
                   AND ((pad.ar = '2016' and pad.mes='1') or (pad.ar = '2015' and pad.mes='7'))
                   AND pad.tiden != '2'
                   AND idm2.nombreM = pad.NOMB
                   AND idm2.apell1m = pad.APE1
                   AND idm2.apell2m = pad.APE2
                   AND (    idm2.numdocm = pad.IDEN || pad.LIDEN OR 
                                idm2.numdocm = pad.NDOCU OR 
                                idm2.numdocm = pad.LEXTR|| pad.IDEN|| pad.LIDEN OR 
                                idm2.numdocm = pad.LEXTR || '0' || pad.IDEN|| pad.LIDEN OR 
                                idm2.numdocm = pad.lextr || SUBSTR (pad.iden, 2, LENGTH (pad.iden))|| pad.liden)
                )                        
         AND idm3.PROREM = '07'
         AND idm3.nombreM = pad2.nomb
         AND idm3.apell1m = pad2.ape1
         AND idm3.apell2m = pad2.ape2
         AND (   (pad2.tiden = '1' AND pad2.liden IS NOT NULL)
              OR (    pad2.tiden = '3'
                  AND pad2.liden IS NOT NULL
                  AND pad2.lextr IS NOT NULL));

I've indexes defined in table ibe_partos_m, field codlote and field cod; and in table ibe_padron_prov_2010_m, fields ape1, ape2 and iden. All indexes are simple.

I can't understand why I'm having that poor performance...isn't the 2 indexes in ape1 and ape2 suffice for improving the join velocity?

Thank you in advance!!

Edit: what I'm trying to achieve is this:

Let's define the records that are correct as the records selected in the inner select.

I'm trying to get the records that are not correct in the former sense, but still fits into some properties, that are:

idm3.codLote = 1
     AND idm3.activo = 1
     AND ((pad2.ar = '2016' and pad2.mes='1') or (pad2.ar = '2015' and pad2.mes='7'))



AND idm3.PROREM = '07'
     AND idm3.nombreM = pad2.nomb
     AND idm3.apell1m = pad2.ape1
     AND idm3.apell2m = pad2.ape2
     AND (   (pad2.tiden = '1' AND pad2.liden IS NOT NULL)
          OR (    pad2.tiden = '3'
              AND pad2.liden IS NOT NULL
              AND pad2.lextr IS NOT NULL));

Edit2: As @Craig Young correctly suspects, I'm only interested (in both selects) in obtaining distinct cod...but is there some way to tell the DB server to stop searching for a given ibe_partos_m once a match is found in some or both selects?

Please explain what you are trying to do, I believe you can optimize this query a lot better. — sagi, Dec 13 '16 at 11:49
use group by instead of distinct, if you have Enterprise Edition than use parallel hint. Index is not a miraclemaker, if you have a 100 millions of row and your index consists a hundred "rows" for a lot of value, than that index will be a very expensive way — Thomas, Dec 13 '16 at 12:51
the select in here "AND idm3.cod NOT IN " : put that into the WITH clause — Thomas, Dec 13 '16 at 12:53
Alex, as I've already stated: We don't have the right kind of information (but you do). We don't know your schema. We don't know your data volumes or distribution stats. We can give you pointers, but you're going to have to do the testing. And as @Thomas pointed out: show the execution plan... It's perhaps the most useful tool in analysing query performance. — Disillusioned, Dec 14 '16 at 00:30
In a few days I hope to have the execution plan...the problem is my enterprise is a mess and The programmers like me don't have access to the DB admin tools and have to ask it all...i even cant create an index by myself, it's very frustrating... — Alex, Dec 14 '16 at 05:52

score 2 · Answer 1 · answered Dec 13 '16 at 12:16

Using JOIN makes the code easier to read, since it's self-explanatory.

LEFT JOIN is guaranteed to return every row from idm2

and adm2.cod is null;

select distinct idm3.cod
  from ibe_partos_m idm3

  inner join ibe_padron_prov_2010_m pad2
     on idm3.nombrem = pad2.nomb
     and idm3.apell1m = pad2.ape1
     and idm3.apell2m = pad2.ape2

  left join (select idm2.cod
          from ibe_partos_m idm2
          inner join ibe_padron_prov_2010_m pad
           on idm2.nombrem = pad.nomb
           and idm2.apell1m = pad.ape1
           and idm2.apell2m = pad.ape2  
         where idm2.codlote = 1
           and idm2.activo = 1
           and ((pad.ar = '2016' and pad.mes = '1') or
               (pad.ar = '2015' and pad.mes = '7'))
           and pad.tiden != '2'         
           and (idm2.numdocm = pad.iden || pad.liden or
               idm2.numdocm = pad.ndocu or
               idm2.numdocm = pad.lextr || pad.iden || pad.liden or
               idm2.numdocm = pad.lextr || '0' || pad.iden || pad.liden or
               idm2.numdocm =
               pad.lextr || substr(pad.iden, 2, length(pad.iden)) ||
               pad.liden))  idm2
   on idm3.cod = idm2.cod   

 where idm3.codlote = 1
   and idm3.activo = 1
   and ((pad2.ar = '2016' and pad2.mes = '1') or
       (pad2.ar = '2015' and pad2.mes = '7'))
   and idm3.prorem = '07'
   and ((pad2.tiden = '1' and pad2.liden is not null) or
       (pad2.tiden = '3' and pad2.liden is not null and
       pad2.lextr is not null))
    and adm2.cod is null;

Wouldn't` (pad2.ar,pad2.mes) IN(('2016','1'),('2015','7'))` have better performance than or? Same goes for the second `OR` : `idm2.numdocm IN(....)` — sagi, Dec 13 '16 at 12:21
@sagi I would expect Oracle's optimiser to generate the same query plan in both cases as they're logically equivalent. But it wouldn't hurt for OP to test both and compare results (something Leketo cannot do, without access to OP's system). — Disillusioned, Dec 13 '16 at 13:03
I think I tested it once and found that `IN()` performs better than `OR`s . @CraigYoung — sagi, Dec 13 '16 at 13:10
You are welcome to read that `IN()` performs better here -> http://stackoverflow.com/questions/782915/mysql-or-vs-in-performance , or here -> http://stackoverflow.com/questions/3074713/in-vs-or-in-the-sql-where-clause — sagi, Dec 13 '16 at 13:43

score 2 · Answer 2 · edited May 23 '17 at 12:00

Alarm bells are screaming at:

SELECT distinct idm3.cod

You're probably using DISTINCT to hide duplicates because of a joining error in your query. If you change that to the following you'll probably see that you're generating a huge number of rows only to hash out the duplicates:

SELECT COUNT(idm3.cod)

We don't know your schema, so we cannot point out your mistake. But you can start by checking appropriate filtering of the PK fields of each of your tables.

I also suggest follow Leketo's advice and use explicit JOIN syntax. It makes joining errors much easier to identify.

I also just noticed you're using many string-numbers.

If the underlying fields are char/varchar: your table design is inefficient. It can be significantly improved by changing those columns to appropriate numeric types.
If the underlying fields are already numeric types: using string-literal filters is probably resulting in non-sargable queries (i.e. you're unable to leverage indexes). Start using correct types and you may see significant improvement.

Poor performance in tables join

2 Answers2