3

I have a column which contains captions to images. What I want to do is find the most popular words occurring in that column.

Is there a way to do this with t-sql?

derekcohen
  • 1,514
  • 4
  • 17
  • 34
  • Especially in T-SQL? Standard SQL does not suffice? – jkj Jan 19 '12 at 06:56
  • Maybe this post (http://stackoverflow.com/questions/314824/t-sql-opposite-to-string-concatenation-how-to-split-string-into-multiple-reco) is a starting point? From there you can split your column contents and then maybe group? – Gorgsenegger Jan 19 '12 at 07:08
  • What RDBMS are you using? MS SQL 2008? – codingbadger Jan 19 '12 at 07:50

1 Answers1

0

I think you can do something like this. I don't think this is the most optimized, but you can improve on it.

Create a table variable @t to store the words Create a cursor, loop through each row in your captions table. for each row, use the chunk split function to get a table (http://www.sommarskog.se/arraylist-2005/tblnum-functions.html#chunk_split_me) add the words from the table to the table variable @t

Do a word and count(word) and order descending on @t .

Jason Dam
  • 390
  • 1
  • 8