I'm trying to create an inline table function to strip out HTML (so that I can CROSS APPLY
it with another table). We have a scalar function in place, but the performance is just not good enough.
I'm really new to using T-SQL, so I don't know that many good practices of it, so if anyone has any suggestions, please let me know!
This is what I have so far:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION StripHTML
(
@text VARCHAR(MAX)
)
RETURNS TABLE
AS
RETURN
WITH BracketIndexes (startIndex, stopIndex) AS (SELECT
CHARINDEX('<', @text),
CHARINDEX('>', @text) UNION ALL SELECT
CHARINDEX('<', @text, stopIndex + 1),
CHARINDEX('>', @text, stopIndex + 1)
FROM BracketIndexes
WHERE startIndex > 0 AND stopIndex > 0
)
SELECT * FROM BracketIndexes
Now that I have the indices of where the <
s and >
are, I'm not sure how to:
- remove the contents between those indices
- return the string so I can cross apply it
I'm thinking I can just concat the left side and right side of each bracket using LEFT
/RIGHT
, but I can't seem to get the indices correct.
Edit:
This kinda worked:
DECLARE @text VARCHAR(100) = 'HELLO <B>WORLD</B> asd';
WITH BracketIndexes (startIndex, stopIndex) AS (
SELECT
CHARINDEX('<', @text),
CHARINDEX('>', @text)
UNION ALL
SELECT
CHARINDEX('<', @text, stopIndex + 1) - (stopIndex - startIndex + 1),
CHARINDEX('>', @text, stopIndex + 1) - (stopIndex - startIndex + 1)
FROM BracketIndexes
WHERE startIndex > 0 AND stopIndex > 0
)
SELECT @text = LEFT(@text, startIndex - 1) + RIGHT(@text, LEN(@text) - stopIndex)FROM BracketIndexes WHERE startIndex > 0 AND stopIndex > 0
SELECT @text
But when when I added some more HTML statements into @text
, the RIGHT
had an incorrect length parameter.