2

My MySQL table is not returning results with a MATCH (col) AGAINST ('') query.

The table is simple:

id | url | fullTextIndex

And my query is

SELECT *, Match(fullTextIndex) AGAINST ("7f7f7f807f8080807f8080807f7f7f807c828888808a86967e8b858d7f89838a76829e958f7badb68084a3a38384899077848b877f799f9c85799fa2827d8c8a ") FROM Pictures;

The last column, the match, is always 0. Except, I know for a fact that the string above is contained, verbatim, in one of the values.

Things to note:

  • The string is only in that row (so it is not in more than 50% of rows, so it shouldn't be ignored).
  • This is not the Full value
  • The column is a bigText column
  • When I use INSTR, I get the value 1 (which is correct)

Any ideas why this query might not be working?

LoveAndCoding
  • 7,857
  • 2
  • 31
  • 55

1 Answers1

4

There seems to be a (configurable) upper limitation on the length of the words considered for indexation:

http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_ft_max_word_len

You can check the current value with SHOW VARIABLES LIKE "ft_max_word_len";

It returns 84 on my server, and your string is 128 chars long.

Suggested fix:

  1. Add this line to your my.cnf file: ft_max_word_len=128 (or whatever max length you need)

  2. Rebuild your indexes as advised on the MySQL website: REPAIR TABLE tbl_name QUICK;

BenMorel
  • 34,448
  • 50
  • 182
  • 322
  • Is there any other solution for getting this to work? That is the issue I believe, but my host doesn't allow me to modify those variables. :( – LoveAndCoding Dec 11 '11 at 01:22
  • I'm afraid not: `ft_max_word_len` is not a dynamic variable, and therefore must be specified in the server configuration! Maybe a change in your application could work around this limitation? – BenMorel Dec 11 '11 at 02:50
  • 1
    Unfortunately that won't work either. That's a fingerprint that we're using for image comparison, we can't really make it smaller, or break it up. I'll just have to setup a different server for it. Thanks :) – LoveAndCoding Dec 11 '11 at 03:12
  • 1
    Seems like you can't increase the `ft_max_word_len` above 84 (http://serverfault.com/q/339859/91709). Still marked best, but this is as a note for anyone else who finds this question. – LoveAndCoding Dec 11 '11 at 09:23
  • TBH, I wouldn't use a fulltext index for this kind of hash matching, this is not what they've been designed for. I would setup another table with a one-to-many relationship to your current table, containing an indexed `BINARY(64)` field (you can insert/select records with `UNHEX()` and `HEX()` respectively), and perform lookups on this separate table. Lookups will be lightning fast and will work with no further configuration! – BenMorel Dec 11 '11 at 13:06
  • The real problem is I will be doing many to many matching. This is just an isolated example. I need to do a many to many matching of these strings, given and stored in a particular order, and will be searching with anywhere between ~40 to 140+ of these strings at any given time. – LoveAndCoding Dec 11 '11 at 18:19
  • Maybe you could try something like `SELECT imageId FROM fingerprints WHERE fingerprint IN(?, ?, ?) GROUP BY imageId HAVING COUNT(imageId) = 3` (adapting the number of items to each query, `3` in this example); that would return the IDs of images having all of these fingerprints (untested, just a thought) – BenMorel Dec 11 '11 at 18:33
  • That, that will probably be what we need to do, but the problem with that is we won't get the advantage of vector distances in the matching. The real issue become inserting in, as it gets to be either a lot of queries or a long query. Ah well, Thanks for your help – LoveAndCoding Dec 11 '11 at 18:39