0

I've recently started using fulltext indexes in a document searching system I'm developing with SQL Server 2008 R2. I've got the binary in one column and the stripped text from the PDF file in the other; the fulltext index is on the text column.

I've come up empty searching (probably b/c I don't know how to phrase the question), but I need to figure out how I can order the results by text occurrence. For example, if someone is searching for the string "book", I want to order the query based on how many times "book" occurs in the text.

Ryan
  • 127
  • 1
  • 10

1 Answers1

0

I would recommend using a 3rd party library to handle searching. A good search engine is a lot more than a string finder.

That being said, I understand it may still be useful to you to implement your text finder as you've specified. You can use a function to count the number of times the sub-string occurs.

I modified a custom Split function from Andy Robinson T-SQL split string

CREATE FUNCTION dbo.countOfSubString (@source VARCHAR(MAX), @stringToFind VARCHAR(MAX))
RETURNS INT

AS
BEGIN

 DECLARE @name NVARCHAR(255)
 DECLARE @pos INT
 DECLARE @count INT = 0

 WHILE CHARINDEX(@stringToFind, @source) > 0
 BEGIN
  SELECT @pos  = CHARINDEX(@stringToFind, @source)  
  SELECT @name = SUBSTRING(@source, 1, @pos-1)

  SET @count = @count + 1

  SELECT @source = SUBSTRING(@source, @pos+1, LEN(@source)-@pos)
 END

 RETURN @count
END

Then use it like this to find the page that has the most matches:

SELECT TOP 1 page_id
    , page_text 
FROM pages
ORDER BY dbo.countOfSubString(page_text, search_text) DESC
Jesse Potter
  • 827
  • 5
  • 20