1

I'm building a query and the latest step involved adding a SELECT COUNT(*) FROM [modification] sub-clause to allow me to detect the last row of the query, but it destroys the performance:

SELECT CONCAT(
      IIF(row_number() OVER (ORDER BY forecastId) % 50000 = 1,
          CONCAT('INSERT INTO modification (userId, epochTime, ',
                 'forecastId, description, auxText, auxDate) VALUES ('), 
          '    ('),
      userId, ',',
      epochTime, ',',
      forecastId, ',',
      '''', dbo.encode4MySql(description), ''',',
      '''', dbo.encode4MySql(auxText), ''',',
      '''', CONVERT(VARCHAR(20), auxDate, 120), ''')',
      IIF(row_number() OVER (ORDER BY forecastId) % 50000 = 0
          OR row_number() OVER (ORDER BY forecastId) = 
                      (SELECT COUNT(*) FROM modification),
          '; COMMIT;', ','))
FROM modification
ORDER BY forecastId;

If you can't see what I'm doing, I'm building INSERT () VALUES (),(),(),... statements of 50000 rows at a time.

Please restrict suggestions for completely alternative approaches to the comments. I'm looking for a way to find the last row number here without it slowing the query down massively - which it does.

I'm not massively familiar with query plans but can post one here if it helps. I've tried a lot of things from related questions here, but nothing that I can get to work.

Adam
  • 5,215
  • 5
  • 51
  • 90
  • Yes please add the execution plan. As I imagine, your modification table is quite large. SELECT COUNT(*) implies that the whole table is read into memory. Maybe it's wide table so adding a narrow index can improve the performance(server will choose that one to read instead of the table to figure out the count) or maybe you'd better change the logic of your inserts, for what purpose do you need this "last row"? Maybe you can use another condition, for example, forecastId = max(forecastId), in this case it's enough to have an index on forecastId to reduce execution time – sepupic Sep 09 '17 at 18:10
  • Out of curiosity, where are you using the `insert` statement? I'm just interested to know which rdbms lets you use 50,000 rows per insert as SQL Server is limited to 1,000 row values per `insert ...values...` statement. – SqlZim Sep 10 '17 at 13:28
  • mysql - see https://stackoverflow.com/questions/3536103/mysql-how-many-rows-can-i-insert-in-one-single-insert-statement – Adam Sep 11 '17 at 10:18

3 Answers3

1

You have a rather complicated expression, so SQL Server may not optimize it. Move the logic to the FROM clause:

FROM (SELECT m.*, COUNT(*) OVER () as num_rows
      FROM modification m
     ) m

And then use num_rows in the rest of the query.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Kudos on the technique. It's not as fast as SqlZim's method, my query on 12M rows returns after 2:58, compared to 3:52 with yours. But it's a good technique, I can definitely see it has its uses. – Adam Sep 11 '17 at 09:22
1

One other option would be to order by forecastId desc in the final or:

IIF(row_number() OVER (ORDER BY forecastId) % 50000 = 0
          OR row_number() OVER (ORDER BY forecastId desc) = 1,
          '; COMMIT;', ','))
SqlZim
  • 37,248
  • 6
  • 41
  • 59
0

Assuming you don't want to change the current design, you could just add an extra UNION ALL step at the end. By looking at your query, it looks like the only purpose of changing the query is to add a COMMIT at the end.

CURRENT QUERY
UNION ALL
SELECT 'COMMIT;';

Let me know if that works for you.

***********UPDATE*********

I thought this query is easier to troubleshoot. See if it will perform any better. You would have to plugin the CTE part for your table.

SELECT BusinessEntityID,JobTitle,HireDate INTO dbo.TestTable FROM [HumanResources].[Employee]
SELECT TOP 0 BusinessEntityID,JobTitle,HireDate INTO dbo.TestTable2 FROM [HumanResources].[Employee]

SET NOCOUNT ON


WITH CTE AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) Col, 
    ' (' + 
    CAST(BusinessEntityID AS VARCHAR(20)) + 
    ', ''' + 
    JobTitle + 
    ''', ''' + 
    CONVERT(VARCHAR(20), HireDate, 120) +
    ''')' Query
FROM TestTable
) 
    SELECT 
        CASE 
        WHEN COL % 50 = 0 THEN ', ' + Query + ' COMMIT;' 
        WHEN COL % 50 = 1 THEN ' INSERT INTO dbo.TestTable2 (BusinessEntityID, JobTitle, HireDate) VALUES ' + Query
        ELSE ', ' + Query 
        END 
    FROM CTE
UNION ALL SELECT 'COMMIT;'
Nayak
  • 473
  • 3
  • 10
  • Good idea but it doesn't work - you missed part of the complexity I need to cover where the final row needs bracketing off, but only if it's not `% 50000 = 0` – Adam Sep 11 '17 at 09:24