1

I have been reading a lot of Query Optimizations and I have been capable of optimizing most of them.

Nevertheless, I have a very complex query. It creates accumulated values for my accounting accounts. The query is taking more than 10 minutes to run, and I think that it should be a better way to optimize it, but I am not figuring out it.

The code that I want to optimize is this:

SELECT Empresa, IDCuenta, Año, Periodo, Saldo, 

((SELECT SUM(Saldo) 
FROM
(SELECT Empresa, IDCuenta, ReferenciaOrden,  SUM(Saldo) As Saldo
    FROM                                            
        (SELECT     Empresa, IDCuenta, ReferenciaOrden, SUM(Saldo) As Saldo 
        FROM dbo.GP_ContabilidadTrxActivas
        WHERE FechaTransacción<=GETDATE()
        GROUP BY    Empresa, IDCuenta, ReferenciaOrden
        UNION ALL
         SELECT     Empresa, IDCuenta, ReferenciaOrden, SUM(Saldo) As Saldo 
          FROM dbo.GP_ContabilidadTrxHistoricas
         WHERE FechaTransacción<=GETDATE()
         GROUP BY    Empresa, IDCuenta, ReferenciaOrden
         ) As Base
        GROUP BY Empresa, IDCuenta, ReferenciaOrden) As BaseInt
WHERE BaseInt.IDCuenta=BaseTotal.IDCuenta AND BaseInt.Empresa = BaseTotal.Empresa 
AND BaseInt.ReferenciaOrden<=BaseTotal.ReferenciaOrden
)) As SaldoAcumulado
FROM
(SELECT Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, SUM(Saldo) As Saldo
                        FROM                                            
                          (SELECT     Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, SUM(Saldo) As Saldo 
                          FROM dbo.GP_ContabilidadTrxActivas WITH (INDEX(IX_ReferenciaOrden)
                          WHERE FechaTransacción<=GETDATE()
                          GROUP BY    Empresa, IDCuenta, Año, Periodo,ReferenciaOrden
                           UNION ALL
                           SELECT     Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, SUM(Saldo) As Saldo 
                           FROM dbo.GP_ContabilidadTrxHistoricas WITH (INDEX(IX_ReferenciaOrden)
                           WHERE FechaTransacción<=GETDATE()
                           GROUP BY    Empresa, IDCuenta, Año, Periodo,ReferenciaOrden
                           ) As Base
GROUP BY Empresa, IDCuenta, Año, Periodo, ReferenciaOrden) As BaseTotal 

The indexes that I created for this query are:

CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxHistoricas (IDCuenta ASC, ReferenciaOrden ASC)
INCLUDE (Empresa,  Año, Periodo, Saldo, FechaTransacción);


CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxActivas (IDCuenta ASC, ReferenciaOrden ASC)
INCLUDE (Empresa,  Año, Periodo, Saldo, FechaTransacción);

The execution plans shows that 87% of the cost are in 3 activities: Index Seek, Stream Aggregate and Merge Join, this is the image of the partial estimated execution plan:

enter image description here

dbo.GP_ContabilidadTrxHistoricas has 3.559.617 rows and dbo.GP_ContabilidadTrxActivas has 102.707 rows

Any advise to optimize it would be more than welcome. Thanks in advance.

GriGrim
  • 2,891
  • 1
  • 19
  • 33
MariPlaza
  • 357
  • 2
  • 5
  • 16
  • Have you also tried not using the index HINTS? Does this give the same execution plan? Maybe get the UNION ALL query in a cte and use it afterwards? You seem to use that query twice, is that correct? – NickyvV Jan 17 '14 at 13:27
  • You were missing ) after query hints. –  Jan 17 '14 at 13:30
  • @SaUce the query runs ok, I just miss typed in the question here. – MariPlaza Jan 17 '14 at 13:34
  • @NickyvV yes I used twice because I need to do the accumulated amounts. I did it without the index HINTS but give me a more costly execution plan. I would try the CTE to check if works better. Thanks! – MariPlaza Jan 17 '14 at 13:38

3 Answers3

2

If you have SQL Server 2005 or higher you can try this:

DECLARE @tempTable TABLE (Empresa VARCHAR(100), IDCuenta INT, Año INT, Periodo INT, ReferenciaOrden INT, Saldo MONEY)

INSERT INTO @tempTable (Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, Saldo)
SELECT Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, SUM(Saldo) AS Saldo
FROM (
    SELECT Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, Saldo
    FROM dbo.GP_ContabilidadTrxActivas
    WHERE FechaTransacción <= GETDATE()
    UNION ALL
    SELECT Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, Saldo
    FROM dbo.GP_ContabilidadTrxHistoricas
    WHERE FechaTransacción <= GETDATE()
) AS Base
GROUP BY Empresa, IDCuenta, Año, Periodo, ReferenciaOrden

SELECT Empresa, IDCuenta, Año, Periodo, Saldo
    , (
        SELECT SUM(Saldo)
        FROM @tempTable AS BaseInt
        WHERE BaseInt.IDCuenta = BaseTotal.IDCuenta
            AND BaseInt.Empresa = BaseTotal.Empresa
            AND BaseInt.ReferenciaOrden <= BaseTotal.ReferenciaOrden
    ) AS SaldoAcumulado
FROM @tempTable AS BaseTotal

And also maybe creating index that contains FechaTransacción field can help. Because you filter tables by it.

GriGrim
  • 2,891
  • 1
  • 19
  • 33
  • Thanks, I tried the query and even is more optimal, it does not give me the right result because the SaldoAcumulado is resulting in the same value as Saldo. If I eliminate the ReferenciaOrden I get the value of the last month and not the accumulated until the specific period. Any other idea? – MariPlaza Jan 17 '14 at 14:04
  • Sorry. I have read your query in a bit wrong way. Which version of SQL Server do you have? It matters. – GriGrim Jan 17 '14 at 14:14
  • SQL Server 2008 R2 Thanks! – MariPlaza Jan 17 '14 at 14:15
  • See updated answer. In SQL Server 2008 there is no other way to calculate running total. Just try to use table variable. – GriGrim Jan 17 '14 at 14:34
2

First, seeing the "INCLUDE" as part of your index confused me as never seeing that so I looked into it and found an excellent explanation/answer in this post. The important note as that the INCLUDE should be on fields that are NOT part of things like a group by. Your query DEFINITELY uses the columns as part of the group by and SHOULD be part of the normal covering index for query optimization.

Second, what is probably killing your time is that you are doing a correlated query for your column Saldo for every returned record in the baseline query thus killing performance running every time repeatedly. I would restructure your query to have the main FROM clause as those two queries run ONCE EACH and JOIN them on the columns respectively. It appears that for each deeper level item, you also want the parent level aggregation total. For example all sales within a given region is one column, but also including the total for comparison to the ENTIRE region. I may be incorrect, but that is what it appears to be.

So, I would just create your index as the following keys on each current and history transaction tables. The first 3 columns are specifically this order to match your higher level aggregations so THAT TOO is optimized without going to the granular level of Ano, Periodo, FechaTransaccion.

( Empresa, IdCuenta, ReferenciaOrden, Ano, Periodo, FechaTransaccion ) include ( saldo )

SELECT 
      BaseTotal.Empresa, 
      BaseTotal.IDCuenta, 
      BaseTotal.Año, 
      BaseTotal.Periodo, 
      BaseTotal.Saldo, 
      SUM( BaseInt.Saldo ) as OrdenSaldo
   FROM
      ( SELECT 
              Empresa,
              IDCuenta,
              ReferenciaOrden,
              Año,
              Periodo,
              SUM(Saldo) As Saldo
           FROM
              ( SELECT
                      Empresa,
                      IDCuenta,
                      ReferenciaOrden,
                      Año,
                      Periodo,
                      SUM(Saldo) As Saldo 
                   FROM 
                      dbo.GP_ContabilidadTrxActivas
                   WHERE 
                      FechaTransacción <= GETDATE()
                   GROUP BY
                      Empresa,
                      IDCuenta,
                      ReferenciaOrden,
                      Año,
                      Periodo
                UNION ALL
                SELECT
                      Empresa,
                      IDCuenta,
                      Año,
                      Periodo,
                      ReferenciaOrden,
                      SUM(Saldo) As Saldo 
                   FROM 
                      dbo.GP_ContabilidadTrxHistoricas
                   WHERE 
                      FechaTransacción <= GETDATE()
                   GROUP BY
                      Empresa,
                      IDCuenta,
                      ReferenciaOrden,
                      Año,
                      Periodo ) As Base
           GROUP BY 
              Empresa,
              IDCuenta,
              ReferenciaOrden,
              Año,
              Periodo ) As BaseTotal 
      JOIN
      ( SELECT 
              Empresa, 
              IDCuenta, 
              ReferenciaOrden,  
              SUM(Saldo) As Saldo
           FROM
              ( SELECT
                      Empresa,
                      IDCuenta,
                      ReferenciaOrden,
                      SUM(Saldo) As Saldo 
                   FROM 
                      dbo.GP_ContabilidadTrxActivas
                   WHERE 
                      FechaTransacción <= GETDATE()
                   GROUP BY
                      Empresa, 
                      IDCuenta, 
                      ReferenciaOrden
                UNION ALL
                SELECT
                      Empresa, 
                      IDCuenta, 
                      ReferenciaOrden, 
                      SUM(Saldo) As Saldo 
                   FROM 
                      dbo.GP_ContabilidadTrxHistoricas
                   WHERE 
                      FechaTransacción <= GETDATE()
                   GROUP BY
                      Empresa, 
                      IDCuenta, 
                      ReferenciaOrden ) As Base
           GROUP BY 
              Empresa, 
              IDCuenta, 
              ReferenciaOrden ) As BaseInt
         ON BaseTotal.Empresa = BaseInt.Empresa
         AND BaseTotal.IDCuenta = BaseInt.IDCuenta
         AND BaseInt.ReferenciaOrden <= BaseTotal.ReferenciaOrden 
   GROUP BY
      BaseTotal.Empresa, 
      BaseTotal.IDCuenta, 
      BaseTotal.Año, 
      BaseTotal.Periodo, 
      BaseTotal.Saldo, 
   ORDER BY
      BaseTotal.Empresa, 
      BaseTotal.IDCuenta, 
      BaseTotal.Año, 
      BaseTotal.Periodo
Community
  • 1
  • 1
DRapp
  • 47,638
  • 12
  • 72
  • 142
  • It does not work because is not following the business rule. Remember that I need to get the running total, I did this change: BaseTotal.ReferenciaOrden = BaseInt.IDCuenta for BaseTotal.ReferenciaOrden <= BaseInt..ReferenciaOrden but the results are not what I expected. Thanks! – MariPlaza Jan 17 '14 at 14:42
  • @MariPlaza, then could you please show at the end of your post some sample of what you EXPECT the query to show with your "running total" column as you mentioned. Things like that in this and future posts will often lead to better initial answers by myself and others. ALSO... I modified query to hopefully handle your roll-up values. Please review it too. – DRapp Jan 17 '14 at 14:51
  • @DRappp OK, thanks for the feedback, I would include it in the future. I review your query and try to modify it to get the data that I was expecting using your technique but it did not work. I have researched a lot about running totals, and it is not a simple query to do and not all the technique retrieve it correctly. Thanks for your help. – MariPlaza Jan 17 '14 at 16:35
1

You are filtering by date, I would suggest you create your indexes like this

CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxHistoricas (FechaTransacción)

If that doesn't help you, try to add the columns as they are in GROUP By clause. This way the index is sorted in the same way as required by GROUp BY

CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxHistoricas (FechaTransacción, Empresa, IDCuenta, Año, Periodo,ReferenciaOrden)

If you still think this is slow, create the covering index with columns from select clause, this way the clustered index doesn't need to be accessed at all

CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxHistoricas (FechaTransacción, Empresa, IDCuenta, Año, Periodo,ReferenciaOrden)
INCLUDE(Saldo)

You can also try to reformat the query using CTEs

TomT
  • 971
  • 7
  • 13
  • Thanks, I create the third index, it improve slightly, it is around 8 minutes now. I try the CTEs but it did not make any difference, just improve the order of the query. – MariPlaza Jan 17 '14 at 14:39