0

I am running two queries that are supposed to do the same thing on our CV database table:

  1. CONTAINS (full-text indexed table - cv, fullname etc are all included in the index)

     SELECT COUNT(1) 
     FROM [CVDatabase] 
     WHERE CONTAINS ((CVText, fullname, city, CompanyName, jobTitle, targetJobTitle),'"cost accountant*"') AND (Country='South Africa')
    

    Number of results: 300

  2. CHARINDEX (unindexed)

    SELECT COUNT(1) 
    FROM [CVDatabase] 
    WHERE (CHARINDEX ('cost accountant', cvtext) > 0 OR 
           CHARINDEX ('cost accountant', fullname) > 0 OR 
           CHARINDEX ('cost accountant', city) > 0 OR
           CHARINDEX ('cost accountant', companyname) > 0 OR
           CHARINDEX ('cost accountant', jobtitle) > 0 OR
           CHARINDEX ('cost accountant', targetjobtitle) > 0) 
      AND (Country = 'South Africa')
    

    Number of results: 339

This is over a 10% discrepancy. Naturally CHARINDEX with capture 'cost accountant', 'cost accountants' (which CONTAINS will also capture with the wildcard). The difference might come from characters before 'cost accountant' which CHARINDEX will capture but not CONTAINS - but I cannot think of any characters that would make sense before the word 'cost'.

Is it possible there is something wrong with the indexing, and it is not picking up all the results? It is set to track automatic changes so it should be indexing correctly.

1 Answers1

1
  1. Are all those rows indexed?

  2. SQL FT will split the text into words so it will not find "xyzcost accountant", but that's probably ok. Still, what happens if you search for "cost accountant" (i.e. without the wildcard"?

  3. Here are some details on how to check if the indexing process has completed. Note that if it hasn't your queries still run but produce inconsistent results (i.e less results). This is because the designers of FT want to enable the system to update and insert new data without downtime but a little data loss may happen.

  4. The examples on msdn do not show how to combine phrase search with an wildcard. Maybe you could use "NEAR" instead?

Community
  • 1
  • 1
Bogdan Gavril MSFT
  • 20,615
  • 10
  • 53
  • 74