MySql `MATCH AGAINST` and `LIKE` combination to search for special characters

Question

For example, in database i have row with phrase DN-NP. In input field i type DN- and want to find the row. Here is example http://sqlfiddle.com/#!9/f9235a/4 Tried to use FULLTEXT index and MATCH AGAINST

SELECT `name` 
FROM `domains` 
WHERE MATCH (`name`) AGAINST ('DN*' IN BOOLEAN MODE);

get no results.

Here https://dba.stackexchange.com/a/111895/55397 is advice to combine with LIKE. At the moment idea is in php something like

if( strlen($_POST['input_field_value']) <= 2 ){ 
$sql = 'SELECT `name` 
FROM `domains` 
WHERE MATCH (`name`) AGAINST ('DN*' IN BOOLEAN MODE)
OR `name` LIKE "%DN%"'; 
}

But LIKE "% %" too slow? Any ideas how to solve the problem (to find phrases that contains special characters)?

What about LOCATE (performance)?

SELECT name AS name7 
FROM domains 
WHERE LOCATE('DN',`name`)>0;

score 1 · Answer 1 · answered Nov 07 '21 at 05:41

1

how about regexp ?

SELECT name AS name6 FROM domains WHERE name regexp ('DN');

answered Nov 07 '21 at 05:41

Ali Im

25
7

score 1 · Accepted Answer · answered Nov 07 '21 at 16:52

1

Indexes can help with speed by limiting the number of rows to look at. Most code shown so far requires testing every row.

FULLTEXT is very good at finding rows when its rules apply. I doubt if +DN* applies due to word-length and existence of punctuation.
`LIKE "DN-NP%" can use an index very efficiently. But that only works for the string being at the start of the column.
`LIKE "%DN-NP%" -- The leading wildcard requires checking every row.
LOCATE and any other string operator -- not sargable, so needs to look at every row.
REGEXP "DN-NP" -- slower than LIKE. (There are other situations where REGEXPcan be faster and/orLIKE` won't apply.)

If you have the min word-length set to 2, then this trick may be the most efficient:

WHERE MATCH(col) AGAINST("+DN +NP" IN BOOLEAN MODE)
  AND col LIKE '%DN-NP%'

The MATCH will efficiently whittle down the number of rows; the LIKE will make further whittle down the number or rows, but only looking at the small number from the MATCH.

Caveat: Which of these do you need to match or not match?:

abc DN-NP def
abc DNs-NPed def   -- look likes "plural", etc which FULLTEXT matches
abc DN-NPQRS def   -- word boundary issue
abc ZYXDN-NP def

REGEXP can match a "word boundary"; LIKE does not have such.

Please build a list of things you want to match / not match. We might have a better answer for you.

answered Nov 07 '21 at 16:52

Rick James

135,179
13
127
222

Thank you for advice. When in input field i type `DN` i want to find in mysql rows that contains ... actually i want to find all rows that contains `DN` in any position (all four examples). If in input field i type `DN-`, i also want to find all mysql rows that contains `DN-` in any position. I mean, i want to find all rows that contain input string(characters) – Andris Nov 08 '21 at 03:56
Continued tests in http://sqlfiddle.com/#!9/28a3f3/9. Seems with `MATCH` problem is, if in input field i type `-` character, then `IN BOOLEAN MODE` it means `NOT`. Without `BOOLEAN MODE` get nothing. `LIKE` with `%...%` does exactly what i need. But as i understand is too slow if many rows. So question is how to decrease number of rows for `LIKE`. Interesting information here https://stackoverflow.com/a/41716705/2118559 "I would like to comment that surprisingly, creating an index also helped speed up queries for like '%abc%' queries in my case." – Andris Nov 08 '21 at 04:21
@Andris - `SHOW VARIABLES LIKE 'innodb_ft_min_token_size';` probably says 3, which means that `DN` is too short to be indexed. – Rick James Nov 08 '21 at 07:09
Yes, correct, minimum is 3. Got `[0] => Array ( [Variable_name] => innodb_ft_min_token_size [Value] => 3 )` – Andris Nov 08 '21 at 17:32
1

@Andris - If you change that setting to 2, you will need to recreate the index before it takes effect. – Rick James Nov 08 '21 at 18:19

MySql `MATCH AGAINST` and `LIKE` combination to search for special characters

2 Answers2

Linked