4

I have a table called hotel with the following information:

  • Hotel_Id: 2950
  • Hotel_Name: Inn on the Park
  • Hotel_Number: 01234567
  • Hotel_TypeId: 1

I need to be able to search for records where the name column contains certain terms.

The search is:

select * 
from ContainsTable(hotel, Hotel_Name, '"Inn on Park"')

I get no results but if I search:

select * 
from ContainsTable(hotel, Hotel_Name, '"In on Park"')

I get

Key: 2950
Rank: 176

I figured there was some issue with the term "inn" but if I search for:

select * 
from ContainsTable(hotel, Hotel_Name, '"Inn"')

I get back the same key: 2950, Rank: 176 result.

Is "inn" a keyword that is causing this problem?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
GPW
  • 129
  • 1
  • 8
  • This is a bug in a live DB with loads of rows but I've created a tmp table and only have the one row in it which is "Inn on the Park" – GPW May 09 '15 at 00:26
  • What does the following query return: `select display_term from sys.dm_fts_parser('"Inn on Park"', your_lcid, 0, 0)` (you can try *your_lcid* with both 0 and your actual locale identifier)? – Mackan May 09 '15 at 08:29
  • For both 0 and 2057 (british english) lcid, I get back the same results... display_term: inn, on, park – GPW May 09 '15 at 12:58
  • 2
    Yeh, that was my bad - I was interested to know how your search would treat the `the` (if it was a _stop word/noise word_ or not), but forgot to add it in the query :) Anyway, since "In on Park" seemed to be working I guess I'm on the wrong track anyway. Perhaps redoing the fulltext index would help - not sure. – Mackan May 09 '15 at 13:21
  • Well, if I add "the" into the query, it is returned in the result list. The application sitting on top of this query does strip out noise words like "the" and "on" anyway. My real confusion is just on the "inn" not working but "in". I have rebuilt the index but it has no effect, I've also tried changing the locale of the index, again no luck. – GPW May 10 '15 at 16:39
  • I've tried changing the hotel name to: "Inn onn the park". And if I search for "onn the park", I get no results but if I search for "on the park" i do. So I guess it's, if a noise/key word has the extra character it becomes a problem? – GPW May 10 '15 at 17:10
  • It really shouldn't. But when you say "*The application sitting on top of this query does strip out noise words*", do you mean sql server, or some other application? – Mackan May 11 '15 at 09:37

1 Answers1

4

This is my theory..

Your hotel name, Inn on the Park, would be index like this:

pos   word
1     Inn
2     (noise)
3     (noise)
4     Park

on and the are stop/noise words, and not stored in the index (but notice that the position is still stored).

You're searching by a full string. Taking in to account the noise words this would mean:

Query 1: "Inn on Park" -> "Inn (noise) Park"
Query 2: "In on Park" -> "(noise) (noise) Park"

Your indexed string (hotel name) is Inn (noise) (noise) Park, so this would be a partial match on the second query, but no match on the first.

This can be tested by for example searching for Inn 1 1 Park. This will return a result. But Inn 1 Park or 1 Inn 1 Park will not (positions does not correspond).

To "fix" this you could use different operators, like AND, OR or NEAR:

"Inn" AND "Park"
"Inn*"
"Inn" NEAR "Park" 

Here are two screenshots, showing the results of your main queries. Notice how the second will only search for "Park", or "noise noise Park" (any of those would return results):

enter image description here

enter image description here

Mackan
  • 6,200
  • 2
  • 25
  • 45
  • OK, that makes sense. I didn't know the position of the noise word was important. In the real life scenario, there on and the are stripped out before it goes into the query so this would explain it. Is there anyway that this can be overcome or would I need to not strip out the noise words in the query? – GPW May 11 '15 at 13:49
  • 1
    @GPW I think it would be best to either separate each word between `AND`, like my example above, or send the whole search string to SQL and let it handle the _noises_. If you've already identified them in your front-end, you can just send any noise word (`a`) in the search term though. – Mackan May 11 '15 at 14:08
  • 1
    Thanks so much @Mackan. That was starting to get the better of me. I will just stop stripping out the noise words. – GPW May 11 '15 at 14:14
  • 1
    Beware searching with AND. If search text contains any noise-word, then containstable will return 0 results. This is stupid. You can re-configure SQL Server to change that behaviour. e.g. `sp_configure 'show advanced options', 1; RECONFIGURE; GO sp_configure 'transform noise words', 1; RECONFIGURE; GO ` See [link]https://technet.microsoft.com/en-us/library/ms187914.aspx – Chalky Oct 02 '16 at 23:42
  • 1
    +1 Thanks this really helped me with [my issue](https://stackoverflow.com/questions/48212372/sql-server-containstable-not-working-for-single-digit-numbers/48212373#48212373). – CodingYoshi Jan 11 '18 at 17:04