
I'm seeing behavior I haven't encountered before with MySQL's full-text indexing. The relevance for a search for cell comes back as 0 for all records. The IN BOOLEAN MODE result, though, confirms the term is present (as does the WHERE clause).

select sum(char_length(concat(columns))), MATCH (columns) AGAINST ('+"cell"' IN BOOLEAN MODE), MATCH (columns) AGAINST ('"cell"'), id
from table
where MATCH (columns) AGAINST ('+"cell"' IN BOOLEAN MODE)

I checked the default stop word list, and verified I'm not using a custom list. What could be the reason for the 0 relevance and how can I correct it?

The content lengths vary greatly, from 1717 to 115905 characters, so I wouldn't be surprised to see 4/115905 ranked low (presuming the term occurred only once). I would expect 4/1717 to be at least a fraction, though, and this logic also assumes the term appears only once. The term is actually present multiple times.

I also read the manual entry and thought this passage might explain it:

Relevance is computed based on the number of words in the row (document), the number of unique words in the row, the total number of words in the collection, and the number of rows that contain a particular word.

My guess was that because cell is present in all the articles, its relevance was nullified, but that didn't match other tests I ran. For example, animal model brings back relevance as I'd expect.

My full text index length is 1 character.

The table with the index is:

ENGINE=MyISAM DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED
user3783243
  • You are using MyISAM? For MyISAM, [Search words that show up in 50% or more of the rows will be ignored.](https://stackoverflow.com/a/48431277) (in natural language mode) – Solarflare Mar 07 '19 at 14:12
  • @Solarflare Yes, the engine is `MyISAM` but that doesn't hold true for the `animal model` example. All the rows have that term... or am I misreading that? – user3783243 Mar 07 '19 at 14:17
  • Yes, I think you are misreading it. Terms that show up in >= 50% of rows will be ignored in natural language mode (so you get a score of 0/no search result if you would use that mode in `where`) but not in boolean mode (so you currently find them). – Solarflare Mar 07 '19 at 14:21
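
To make the 50% rule from the comments concrete, here is a rough sketch of the arithmetic. It assumes the IDF-style global weight `log((N - n) / n)` described in the MySQL internals notes on MyISAM full-text search, where `N` is the total row count and `n` is the number of rows containing the term. The function name `global_weight` and the row counts are made up for illustration, and clamping at zero is a simplification of what the server does, not its actual code.

```python
import math


def global_weight(total_rows: int, rows_with_term: int) -> float:
    """Hypothetical helper: IDF-style factor log((N - n) / n), clamped at 0.

    Assumption: a term found in half the rows or more contributes nothing
    in natural language mode, which is modeled here by the clamp.
    """
    if rows_with_term <= 0 or rows_with_term >= total_rows:
        # Term absent, or present in every row: no usable weight.
        return 0.0
    return max(0.0, math.log((total_rows - rows_with_term) / rows_with_term))


if __name__ == "__main__":
    # Illustrative numbers only: a 1000-row table, term found in n rows.
    for n in (10, 400, 500, 900):
        print(n, round(global_weight(1000, n), 3))
```

Under this assumption, the factor is positive only while the term appears in fewer than half of the rows. At 50% and above it is zero, which would multiply any per-row weight down to a relevance of 0, while boolean mode still matches the rows.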
