5

I have a table 'Foo' with two records like this:

int   | nvarchar(4000)
Col0  | Col1
------|--------------
00001 | 'Bar of SNAFU'
00002 | 'Bar SNAFU'

I want to return the record that has the 'of' separator, so I constructed a CONTAINS clause that looks like this:

SELECT *
FROM Foo
WHERE CONTAINS(*, '"Bar of*"')

0 Rows Returned

If I exchange it for a LIKE clause, it works just fine:

SELECT *
FROM Foo
WHERE Col1 LIKE 'Bar of%'

1 Row Returned

If I remove the second word from the search term in the CONTAINS clause, it returns both rows, as expected:

SELECT *
FROM Foo
WHERE CONTAINS(*, '"Bar *"')

2 Rows Returned

The problem only occurs once I include any variation of "o*" in the CONTAINS clause. I can use multiple words in the CONTAINS clause and retrieve the correct rows (verified with a separate query), so whitespace is not the issue, nor does it seem to occur with other letters. I have tried reformatting the CONTAINS to join the individual terms using the AND keyword ("Bar*" AND "of*"), but I got the same result. I need to return just the one row, and the table is rather large (hundreds of thousands of rows), so I would like to utilize the CONTAINS clause, rather than switch to a LIKE.

TT.
  • 15,774
  • 6
  • 47
  • 88
David Rahl
  • 51
  • 5
  • 2
    @a_horse_with_no_name - the only DBMS I can find with a `CONTAINS` function is SQL-Server. It looks like it does fuzzy/approximate/synonym matching. [Docs here](https://learn.microsoft.com/en-us/sql/t-sql/queries/contains-transact-sql?view=sql-server-ver15) Agreed, though, that OP needs to tag this question with the correct database engine. – Bob Jarvis - Слава Україні Dec 04 '19 at 14:45
  • 5
    A wild guess `of` being a stop word and omitted from the FTS index – peterm Dec 04 '19 at 14:51
  • 3
    You say you'd rather not use LIKE, but as long as you don't use a wildcard at the beginning of the search string, it should be OK and will use an index if there is one. See [here](https://stackoverflow.com/questions/7510646/like-vs-contains-on-sql-server/7510685) for other discussion on this. – BlueGI Dec 04 '19 at 14:56
  • @a_horse_with_no_name I had it tagged as sql-server, but I replaced that with the sql-server-2014 tag for better clarity. – David Rahl Dec 04 '19 at 15:24
  • @peterm I also thought that may be an issue, and I'm guessing it's probably the root cause, but I don't know how to get around that. Our clients need to be able to search for their particular products as accurately as possible. – David Rahl Dec 04 '19 at 15:24
  • @BlueGI This is a specific instruction from my supervisor. He is supposed to be pairing with me at some point to explain _why_, but for now, please assume a LIKE _cannot_ be used. – David Rahl Dec 04 '19 at 15:24
  • And if you try `WHERE CONTAINS(*, N'"Bar of*"')`? – TT. Dec 04 '19 at 16:30
  • @TT. Same result (0 rows returned). I think the issue has something to do with the "of" keyword being excluded from the FTS index, as peterm described, but I don't know how to prevent that or work around it to ensure we can include the word 'of' in the search. – David Rahl Dec 04 '19 at 16:37
  • 3
    Apparently it is possible to delete all stop words. Obviously there'll be a bloat tradeoff doing that. https://learn.microsoft.com/en-us/sql/relational-databases/search/configure-and-manage-stopwords-and-stoplists-for-full-text-search?view=sql-server-ver15 – EdmCoff Dec 04 '19 at 16:41
  • @EdmCoff Thank you for this information! This may end up being the route we have to take. If so, I'll let you know so you can post this as the answer. – David Rahl Dec 04 '19 at 16:49
  • 1
    @DavidRahl: you might want to have a look at [this question](https://stackoverflow.com/questions/59271411/sql-server-fulltext-contains-not-working-in-some-words/59271481#59271481), that was posted and answered today, and essentially relates to the same issue with stopwords in SQL Server full text search. – GMB Dec 10 '19 at 20:38

0 Answers0