1

I have a loop that checks all the names and dates from descript column from table tmp13 and stores them as individual rows in other table (tmp14). The problem is that the while loop is executing for a long time. I don't know how to make it run faster.

Here is my code checking the descript columns for names and date. Descript is a text column and can have multiple names and dates. I want to store those names and dates in separate rows.

DECLARE @Id INT
DECLARE @count INT
DECLARE @product_num INT
DECLARE @REQUESTED VARCHAR(50)
DECLARE @FirstDate VARCHAR(255)
DECLARE @RequestedBy VARCHAR(255)

DECLARE @name NVARCHAR(256)
DECLARE @date NVARCHAR(256)
DECLARE @desc NVARCHAR(256)

DECLARE @dateposition INT
DECLARE @nameposition INT
DECLARE @nameend INT

SELECT @count = MAX(id) 
FROM #TMP13

SET @id = 1;

WHILE (@id <= @count)
BEGIN
    SELCET @desc = descript FROM #TMP13 WHERE Id = @Id
    SELECT @product_num = p_Num FROM #TMP13 WHERE Id = @Id
    SELECT @REQUESTED = REQUESTED FROM #TMP13 WHERE Id = @Id
    SELECT @FirstDate = DATE1 FROM #TMP13 WHERE Id = @Id
    SELECT @RequestedBy = BY1 FROM #TMP13 WHERE Id = @Id


while (patindex('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',@desc) > 0)
begin
    set @dateposition = patindex('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9]%',@desc)
    set @date = SUBSTRING(@desc,@dateposition,10)

    set @nameposition = CHARINDEX('-', @desc)+2
    set @nameend = CHARINDEX(' ', @desc, @nameposition)+1
    set @name = SUBSTRING(@desc,@nameposition,@nameend-@nameposition)

    insert into #TMP14 
    values (@Id,@product_num,@REQUESTED, @FirstDate ,@RequestedBY, @date, @name)

    set @desc = SUBSTRING(@desc,@nameend,1024)  
end

set @id = @id + 1;

end

select * from #tmp14;

---sample table

CREATE TABLE #Tmp13(
p_Num             INTEGER  NOT NULL PRIMARY KEY 
REQUESTED          varchar(50),
DATE1            VARCHAR(50),   
BY1              VARCHAR(50),
DESCRIPT         TEXT

);

INSERT INTO #tmp13( p_Num , REQUESTED, DATE1, BY1 , DESCRIPT) VALUES 
(100,'John','5/30/2017','James','05/30/2017 12:25am Eastern Standard Time - 
Mjames reported changes in the pages 05/30/2017 10:35AM JRachael agreed to 
work on the report and report to James 05/30/2017 10:00 AM James reports 
errors in page.',NULL);

INSERT INTO Table_Tasks(WO_NUM,Opendate,ClosedDate,Note) VALUES 
(200,'John','6/1/2017','Rachael','06/1/2017 3:20PM Eastern Standard Time -  
Rsubramaniam reported phone is not functional 06/1/2017 4:00PM Service took 
the phone and replaced it with new one');

OUTPUT

 Id  product_num REQUESTED FirstDate  RequestedBY date         name    date  
                                                                        Name 

 1   100          John      5/30/2017  james      5/30/2017 mjames  5/30/2017 jRachael
M.Doe
  • 99
  • 1
  • 10
  • What SQL are you using? MS MY other? – Brad Oct 15 '18 at 19:26
  • I am using Sql Server – M.Doe Oct 15 '18 at 19:29
  • 4
    The way to make this run faster is to stop using a loop when it isn't needed. This whole thing could be simplified to a single insert statement. – Sean Lange Oct 15 '18 at 19:30
  • 1
    You do not have a loop, you have nested loops. This is a recipe for disaster. Please [edit] your question with sample data as DDL+DML and expected outout, so that we can show you a better way to get there. – Zohar Peled Oct 15 '18 at 19:30
  • 1
    "Descript is a text column and individual cell can have multiple names and dates" - Yet another terrible problem. – Zohar Peled Oct 15 '18 at 19:34
  • I will edit the question and add the expected and table format – M.Doe Oct 15 '18 at 19:51
  • You probably need to use 'Set Based' thinking instead of 'Procedural'. This is a common problem for people who are somewhat new to SQL. – Sam Oct 15 '18 at 20:21
  • @Sam Are you sure? – paparazzo Oct 15 '18 at 23:16
  • I tried all the suggestion, and I am not very successful in getting the result. Can anyone please suggest some other way to tackle this issue. – M.Doe Oct 17 '18 at 13:51

3 Answers3

1

Convert the nested loops into two CROSS JOINED SELECTS and use it to insert values. The overall structure will be be something like this

INSERT INTO #TMP14 
    (Id, product_num, REQUESTED, FirstDate , RequestedBY, date, name)
SELECT ...
CROSS JOIN
SELECT ...

Now you have to turn the outer loop which generates a sequence of numbers into one of these SELECT statements, and the inner loop which results from a string split into the other SELECT statements.

This SO question has several answers to the first problem: How to generate a range of numbers between two numbers?.

This SO question has answers to the second problem: Turning a Comma Separated string into individual rows.

SQL-Server and database servers in general are highly optimized query engines, but are not very good in manipulating tables in loops.

Olivier Jacot-Descombes
  • 104,806
  • 13
  • 138
  • 188
1

While loops will be slow because that's what happens when you use procedural logic to develop in a declarative programming language. For this you can use patternSplitCM. There are issues with your sample data but this should get you what you need.

SELECT t.p_Num,
       t.REQUESTED,
       t.date1,
       t.BY1,
       parsedDate1 = MAX(CASE WHEN s.ItemNumber = 1 THEN s.Item END),
       parsedDate2 = MAX(CASE WHEN s.ItemNumber > 1 THEN s.Item END)
FROM   #Tmp13 AS t
CROSS 
APPLY    dbo.patternSplitCM(t.DESCRIPT, '[0-9/]') AS s
WHERE    s.[Matched] = 1 AND TRY_CAST(s.item AS DATE) IS NOT NULL
GROUP BY t.p_Num,t.REQUESTED, t.date1,t.BY1

Results:

p_Num       REQUESTED    date1        BY1       parsedDate1  parsedDate2
----------- ------------ ------------ --------- ------------ ------------
100         John         5/30/2017    James     05/30/2017   05/30/2017
200         John         6/1/2017     Rachael   06/1/2017    06/1/2017

Lastly, for optimal performance you want an index on the columns p_Num,REQUESTED,DATE1,BY1 to support the GROUP BY clause. The index would look something like:

CREATE UNIQUE NONCLUSTERED INDEX uq_xxx ON #Tmp13(p_Num,REQUESTED,DATE1,BY1) 
--INCLUDE (DESCRIPT); -- if you can change this to VARCHAR(8000) or VARCHAR(max)

Also note that this will likely run faster with a parallel execution plan. For that you can use TRACEFLAG 8649 or Adam Machanic's make_parallel().

Alan Burstein
  • 7,770
  • 1
  • 15
  • 18
  • Yeah I was trying to type fast and I might have missed something. This is really helpful, but I also want the name in the descript. How do I find that? – M.Doe Oct 15 '18 at 20:41
  • I adjusted the sample data table. Basically my main concern is when the descript reads 05/30/2017 12:25am Eastern Standard Time - Mjames I want the date = 05/30/2017 and name = Mjames stored – M.Doe Oct 15 '18 at 20:43
  • I tried to use this way, but I have multiple names and date for each cell. I have to find the dates manually this way since, I dont know how many dates I have. – M.Doe Oct 17 '18 at 13:38
-1

Some speed improvements set variables using single select instead of multiple select

SELCET @desc = descript,  @product_num = p_Num ,  @REQUESTED = REQUESTED  , @FirstDate = DATE1,  @RequestedBy = BY1  FROM #TMP13 WHERE Id = @Id

Remove Text data type (deprecated), use varchar(max)

CREATE TABLE #Tmp13(
p_Num             INTEGER  NOT NULL PRIMARY KEY 
REQUESTED          varchar(50),
DATE1            VARCHAR(50),   
BY1              VARCHAR(50),
DESCRIPT        varchar(max) --TEXT
);

Use varchar in place of nvarchar if possible.(@desc variable)

Do not repeat patindex call in loop

set @dateposition=999999--loop start
while (@dateposition > 0)
begin
    set @dateposition = patindex('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9]%',@desc)
    if(@dateposition>0)begin
    ....
    end
end
Xilmiki
  • 1,453
  • 15
  • 22