I am working on a ticketing system where user can buy tickets for events. For these events info we are relying on different suppliers feed. Here the problem is each supplier would call each event in a different way but there will be some similarity.
So before inserting any event info into a table I need to find out are there any events with similar shows. If there are any similar events, we have some more filter mechanism where we would be checking the venue, dates and etc.
To find out the similar events at first step I am doing the following.
Splitting the new event name into string array (using cursor). Looping through every item and finding the events with similar name (using like). And inserting all those records into a temp table.
DECLARE @TmpTable TABLE (ProductHeaderID int, Name varchar(200))
DECLARE @TmpAddress TABLE (AddressId int)
DECLARE @Item Varchar(100)
DECLARE Items CURSOR LOCAL FOR select Item from dbo.SplitString(@ProductHeader,' ')
OPEN Items
FETCH NEXT FROM Items into @Item
WHILE @@FETCH_STATUS = 0
BEGIN
If dbo.Trim(@Item) <> '-' and dbo.Trim(@Item) <> ' ' and LEN(dbo.Trim(@Item)) > 3
Begin
--Print @Item
Insert into @TmpTable (ProductHeaderID,Name)
Select ProductHeaderID,Name from Product.ProductHeaderRepository
Where Name LIKE '%[^a-z]'+dbo.trim(@Item)+'[^a-z]%'
End
FETCH NEXT FROM Items into @Item
END
CLOSE Items
DEALLOCATE Items
SplitString is an user defined function which returns a table.
ALTER function [dbo].[SplitString]
(
@str nvarchar(max),
@separator char(1)
)
returns table
AS
return (
with tokens(p, a, b) AS (
select cast(1 as bigint), cast(1 as bigint), charindex(@separator, @str)
union all
select p + 1,b + 1, charindex(@separator, @str, b + 1) from tokens where b > 0
)
select p-1 ItemIndex,substring(@str, a, case when b > 0 then b-a ELSE LEN(@str) end) AS Item from tokens
);
And the Trim function is below
ALTER FUNCTION [dbo].[TRIM](@string VARCHAR(MAX))
RETURNS VARCHAR(MAX)
BEGIN
RETURN LTRIM(RTRIM(@string))
END
But performance wise above mentioned logic taking lot of time. Just for 4k records it is taking more than 10 minutes. Could somebody help me in optimizing the above.