This SQL gives me the blog comments that contain just the banned words defined in my table. I only get the EXACT matches and it removes duplicate rows. It also eliminates variants of a banned word. Which is what I want.
DECLARE @BlogComment TABLE (
BlogCommentId INT IDENTITY PRIMARY KEY,
BlogCommentContent VARCHAR(MAX),
Id int);
INSERT INTO @BlogComment
(BlogCommentContent,
Id)
VALUES
('There are many of us.' ,1),
('This is the man.', 2),
('I hear you.', 2),
('Your the man.',2);
DECLARE @BannedWords TABLE (
BannedWordsId INT IDENTITY PRIMARY KEY,
Word varchar(250));
INSERT INTO @BannedWords (Word) VALUES
('though'),
('man'),
('hear');
;WITH rs AS
(
SELECT word = REPLACE(REPLACE([value],'.',''),',','')
,Id
FROM @BlogComment
CROSS APPLY STRING_SPLIT(BlogCommentContent, SPACE(1))
)
SELECT DISTINCT bw.Word,
rs.id
FROM rs
INNER JOIN @BannedWords bw ON rs.word = bw.Word;
Results of running this are:
Word id
hear 2
man 2
What I expect.
Now I want to take it 1 step further. Test case: I have more than 1 banned word in the same blog comment.
So I altered the code (the table values) to include the test case. A blog comment with 2 banned words.
('He is the man. I hear ya.',2),
I want only 1 row returned for this case. Either one.
Word id
hear 2
And altered the code to accommodate this by adding 2 more lines of code per the 'accepted answer' from - Get top 1 row of each group
,ROW_NUMBER() OVER(PARTITION by Id ORDER BY BlogCommentContent) AS rn
WHERE rn = 1;
DECLARE @BlogComment TABLE (
BlogCommentId INT IDENTITY PRIMARY KEY,
BlogCommentContent VARCHAR(MAX),
Id int);
INSERT INTO @BlogComment
(BlogCommentContent,
Id)
VALUES
('There are many of us.',1),
('He is the man. I hear ya.',2),
('Your the man.',2);
DECLARE @BannedWords TABLE (
BannedWordsId INT IDENTITY PRIMARY KEY,
Word varchar(250));
INSERT INTO @BannedWords (Word) VALUES
('though'),
('man'),
('hear');
;WITH rs AS
(
SELECT word = REPLACE(REPLACE([value],'.',''),',','')
,Id
,ROW_NUMBER() OVER(PARTITION by Id ORDER BY BlogCommentContent) AS rn
FROM @BlogComment
CROSS APPLY STRING_SPLIT(BlogCommentContent, SPACE(1))
)
SELECT DISTINCT bw.Word,
rs.id
FROM rs
INNER JOIN @BannedWords bw ON rs.word = bw.Word
WHERE rn = 1;
Results of running this are no rows returned:
Word id
So, not sure why the 'accepted answer' does not work for me.