2

i need optimize this query, please see the comented line:

SELECT p.NUM_PROCESSO,
       p.NUM_PROC_JUD,
       p.Num_Proc_Jud_Antigo1,
       p.Num_Proc_Jud_Antigo2,
       p.Num_Proc_Jud_Novo,
       a.assunto,
       su.subassunto,
       u.UNIDADE,
       s.SERVIDOR,
       dvj.data_vinc,
       p.TIPO,
       c.DESC_CLASSIF
FROM   processo p
       LEFT OUTER JOIN assunto a
         ON a.cod_assunto = p.cod_assunto
       LEFT OUTER JOIN subassunto su
         ON su.cod_subassunto = p.cod_subassunto
       LEFT OUTER JOIN Distrib_VincJud dvj
         ON dvj.num_processo = p.num_processo
       LEFT OUTER JOIN servidor s
         ON S.COD_SERVIDOR = dvj.COD_SERVIDOR
       LEFT OUTER JOIN unidade u
         ON u.COD_UNIDADE = s.COD_UNIDADE
       LEFT OUTER JOIN Classif_Processo c
         ON C.COD_CLASSIF = p.COD_CLASSIF
WHERE  p.TIPO = 'J'
       AND p.NUM_PROCESSO NOT IN (SELECT d.num_processo
                                  FROM   distribuicao d
                                  WHERE  d.COD_SERVIDOR in ( '0', '000' )
                                         AND d.num_distribuicao IN
                                             (SELECT MAX(num_distribuicao)
                                              FROM   Distribuicao
                                              GROUP  BY num_processo)
                                         --this suquery return 100k lines !!! and consume all CPU:
                                         AND dvj.id_vinc IN
                                                (SELECT MAX(id_vinc)
                                                FROM Distrib_VincJud
                                                where  ativo = '1'
                                                GROUP  BY num_processo))
       AND p.NUM_PROCESSO NOT IN (SELECT num_processo
                                  FROM   Anexos)
       AND s.ATIVO = 1  

my horrible solution at this moment: http://pastebin.com/C4PHNsSc

celsowm
  • 846
  • 9
  • 34
  • 59
  • 3
    What RDBMS are you using? [If MySQL avoid using `IN` for this.](http://stackoverflow.com/q/3417074/73226) – Martin Smith Sep 16 '11 at 21:19
  • 1
    How many rows do you expect that subquery to return? Have you run the subquery on its own and investigated the results? Unless I'm missing something obvious, it's going to be hard to help you without any information about your database schema or data. – Dan J Sep 16 '11 at 21:22
  • 1
    Optimize it for what? Speed? Readability? Your left outer joins are going to kill you.... – Christian Payne Sep 16 '11 at 21:25
  • 2
    Post the RDBMS and query execution plan. – Maxim Krizhanovsky Sep 16 '11 at 21:26
  • 3
    The `s.ATIVO = 1` condition makes many of the `LEFT JOIN`s pointless anyway. – Martin Smith Sep 16 '11 at 21:26
  • @celsowm - I see you've tagged this SQL Server now. You'll need to post the execution plan. [See this answer if not sure what that is](http://stackoverflow.com/questions/7359702/how-do-i-obtain-a-query-execution-plan/7359705#7359705) – Martin Smith Sep 16 '11 at 21:54
  • my horrible solution at this moment: http://pastebin.com/C4PHNsSc – celsowm Sep 16 '11 at 22:30
  • 1
    try `NOT EXISTS` instead of `NOT IN` – Magnus Sep 17 '11 at 00:00

3 Answers3

1

What I would do is convert the IN and NOT IN into joins:

SELECT p.NUM_PROCESSO, p.NUM_PROC_JUD, p.Num_Proc_Jud_Antigo1,
    p.Num_Proc_Jud_Antigo2, p.Num_Proc_Jud_Novo, a.assunto,
    su.subassunto, u.UNIDADE, s.SERVIDOR, dvj.data_vinc, p.TIPO,
    c.DESC_CLASSIF
FROM
    processo p
    INNER JOIN (
        SELECT p.num_processo,
            CASE WHEN dvj.id_vinc IS NOT NULL
                AND d.num_distribuicao IS NOT NULL
                OR a.num_processo IS NOT NULL THEN
                1
            ELSE
                0
            END exclude
        FROM
            processo p
            LEFT JOIN Anexos a
                ON p.num_processo = a.num_processo
            LEFT JOIN (
                SELECT num_processo,
                    MAX(num_distribuicao) AS max_distribuicao
                FROM Distribuicao
                GROUP BY num_processo
            ) md ON p.num_processo = md.num_processo
            LEFT JOIN (
                SELECT num_processo, MAX(id_vinc) AS max_vinc
                FROM Distrib_VincJud
                WHERE ativo = '1'
                GROUP BY num_processo
            ) mv on p.num_processo = mv.num_processo
            LEFT JOIN distribuicao d
                ON p.num_processo = d.num_processo
                    AND md.max_distribuicao = d.num_distribuicao
            LEFT JOIN Distrib_VincJud dvj
                ON p.num_processo = dvj.num_processo
                    AND mv.max_vinc = dvj.id_vinc
        WHERE d.COD_SERVIDOR in ('0', '000')
    ) IncExc
        ON p.num_processo = IncExc.num_processo
    LEFT OUTER JOIN assunto a
        ON a.cod_assunto = p.cod_assunto
    LEFT OUTER JOIN subassunto su
        ON su.cod_subassunto = p.cod_subassunto
    LEFT OUTER JOIN Distrib_VincJud dvj
        ON dvj.num_processo = p.num_processo
    LEFT OUTER JOIN servidor s
        ON S.COD_SERVIDOR = dvj.COD_SERVIDOR
    LEFT OUTER JOIN unidade u
        ON u.COD_UNIDADE = s.COD_UNIDADE
    LEFT OUTER JOIN Classif_Processo c
        ON C.COD_CLASSIF = p.COD_CLASSIF
WHERE
    p.TIPO = 'J'
    AND IncExc.exclude = 0
    AND s.ATIVO = 1
Xint0
  • 5,221
  • 2
  • 27
  • 29
  • it seems that the results came literally the opposite of what he wanted, which can help in some way – celsowm Sep 19 '11 at 15:58
  • If the results are exactly the opposite, then change the condition from `IncExc.exclude = 0` to `IncExc.exclude = 1` – Xint0 Sep 19 '11 at 22:06
0

This part

 AND p.NUM_PROCESSO NOT IN (

        SELECT d.num_processo FROM distribuicao d 
        WHERE d.COD_SERVIDOR in ('0','000') 
        AND d.num_distribuicao IN (
              SELECT MAX(num_distribuicao) FROM Distribuicao GROUP BY num_processo
        )

and this part

  AND p.NUM_PROCESSO NOT IN (

        SELECT num_processo FROM Anexos

  )

are going to be your biggest bottlenecks in the query as you've got nested subqueries in there.

You also have a few of these:

SELECT MAX(id_vinc) FROM Distrib_VincJud where ativo = '1' GROUP BY num_processo)
SELECT MAX(num_distribuicao) FROM Distribuicao GROUP BY num_processo

You might gain a few more seconds by letting these be seperate queries where you can store the results.

In fact, you might do well to have a separate table with these NOT IN(...) values that gets updated upon every insert to your database. It all depends on how often you run each query.

Have you tried running your Query Optimizer on these?

rlb.usa
  • 14,942
  • 16
  • 80
  • 128
0

Separate out the sub queries and then do a join

i.e. find all the num_processo that you are excluding in one query first. do a left join with the processo table on the num_processo field and exclude those where the first table's num_processo field is null

Edit: what's the relationship between the tables distribuicao and distrib_vincJud?

this line is killing your performance...

AND dvj.id_vinc IN
(   SELECT MAX(id_vinc)
    FROM Distrib_VincJud
    where  ativo = '1'
    GROUP  BY num_processo
)

sub query in a sub query which then references a joined table outside of the sub query??????

clyc
  • 2,420
  • 14
  • 15