2

I have a table that contains messages taken for our clients and I basically need to find the top 50 list of the most commonly used words.

The table is called Messages, the typed message is Message. Effectively I want to SELECT all rows, recursively split each Message by space to get all words, GROUP BY each word and COUNT them. I'm struggling on recursively splitting by a space - any ideas?

eth0
  • 4,977
  • 3
  • 34
  • 48

1 Answers1

3
DECLARE @Table TABLE (userID INT,MESSAGE NVARCHAR(MAX))
INSERT INTO @Table
SELECT '1','word1 word2 word3' UNION ALL
SELECT '1','word2 word5 word4' UNION ALL
SELECT '2','word5 word2 word1' UNION ALL
SELECT '2','word1 word2 word3'

; WITH c AS(
SELECT N.Value,COUNT(N.Value) Cnt FROM 
@Table T
CROSS APPLY
dbo.split(T.message,' ') N
GROUP BY N.Value
)
SELECT  c.Value,c.Cnt
FROM    c
ORDER BY Cnt DESC

Result:

enter image description here

Note: you need split function which you can find in numerous examples in Internet. As exapmple, see here: Split function equivalent in T-SQL?

Community
  • 1
  • 1
Dalex
  • 3,585
  • 19
  • 25