1

I have a USERSEARCH table that should be used for fast substring searches for users. This feature is for an autocomplete search that occurs while someone is typing for a username or name. However, the query I am interested in will only show matches from users the person follows.

USERSEARCH
-----------------------------------------------
user_id(FK)    username_ngram          name_ngram
1              "AleBoy leBoy eBoy..."  "Ale le e"
2              "craze123 raze123 ..."  "Craze raze aze ze e"
3              "john1990 ohn1990 ..."  "John ohn hn n"
4              "JJ_1 J_1 _1 1"         "JJ"


USERRELATIONSHIP
-----------------------------------------------
user_id(FK)    follows_id(FK)
2              1
2              3

The following query is performed as someone has just typed "Al":

SELECT * FROM rage.usersearch where username_ngram like 'Al%' --1
UNION DISTINCT
SELECT * FROM rage.usersearch where name_ngram like 'Al%'  --2
UNION DISTINCT
SELECT * FROM rage.usersearch                             --3
WHERE MATCH (username_ngram, name_ngram) AGAINST ('Al')  
LIMIT 10

Indices

index(user_id)
index(username_ngram)
index(name_ngram)
FULLTEXT(username_ngram, name_ngram)

Is there a way to restrict the above query to only look at this subset of user_ids (without querying for them 3 times for each subquery)?

SELECT follows_id FROM rage.userrelationship WHERE user_id={user_id of user doing the searching} 
Rage
  • 870
  • 9
  • 27
  • Fulltext index doesn't work with LIKE. Search for "UNION optimization". – Paul Spiegel Jun 19 '20 at 20:06
  • @PaulSpiegel What about normal indices like the two I mentioned? I know they help with LIKE {string}% – Rage Jun 19 '20 at 20:11
  • This might help: [sql-performance-union-vs-or](https://stackoverflow.com/questions/13750475/sql-performance-union-vs-or) – Paul Spiegel Jun 19 '20 at 20:12
  • `LIKE '{string}%'` is a range search. MySQL cannot perform two (efficient) range searches on one index. – Paul Spiegel Jun 19 '20 at 20:14
  • This LIMIT is acting on the full UNION set. See [UNION Clause](https://dev.mysql.com/doc/refman/8.0/en/union.html) "To apply an ORDER BY or LIMIT clause to an individual SELECT, parenthesize the SELECT and place the clause inside the parentheses". Which you didn't - so it acts on the full set. – Paul Spiegel Jun 19 '20 at 21:40
  • But at the same time ;-) the engine might be smart enough to skip the execution of other UNION parts when it finds the first row. – Paul Spiegel Jun 19 '20 at 21:45
  • @PaulSpiegel I have updated the question, please take a look – Rage Jun 19 '20 at 22:17
  • This now seems to be a quite different question. Please don't use SO this way. Consider to ask a new question with an MCVE. But one thing I can already tell you: This might work this LIKE but not with FULLTEXT. – Paul Spiegel Jun 19 '20 at 22:44
  • You are right, https://stackoverflow.com/questions/62479503/how-to-avoid-re-querying-a-temporary-query-in-mysql – Rage Jun 19 '20 at 22:46

2 Answers2

1

MySQL can use only one INDEX per table reference. It also can use only one range scan per index. So neither two separate indices on the two column, nor a composite index on both columns would prevent a full table scan. And a FULLTEXT index doesn't work with LIKE. Best you can do to optimize this query is to combine two separate searches in a UNION query:

SELECT user_id FROM myapp.usersearch WHERE username_ngram LIKE '{string}%'
UNION DISTINCT
SELECT user_id FROM myapp.usersearch WHERE name_ngram LIKE '{string}%'

The engine can now use INDEX(username_ngram) for the first query part and INDEX(name_ngram) for the second.

Paul Spiegel
  • 30,925
  • 5
  • 44
  • 53
  • That makes more sense, thank you for linking the article. The query above would be fast because it would make use of: index(username_ngram) and index(name_ngram) correct? – Rage Jun 19 '20 at 20:31
  • Right, - The individual parts will use one index each. – Paul Spiegel Jun 19 '20 at 20:32
  • What if the table had a FULLTEXT(username_ngram, name_ngram) index and the query was: SELECT id, username, name FROM table WHERE MATCH (username, name) AGAINST ({string}); – Rage Jun 19 '20 at 21:06
  • Fulltext index is an index of full words, so it can only find full words and can't be used for autocompletion. E.g. it will not find "Johnson" when you type in "John". – Paul Spiegel Jun 19 '20 at 21:09
  • True, but look at my ngram columns, they exploit this. – Rage Jun 19 '20 at 21:09
  • I didn't realize that - interesting approach. As I understand - you are creating your ohn kind of substring index. Maybe you've been too fast accepting my answer :-) – Paul Spiegel Jun 19 '20 at 21:21
  • I have adjusted my question, please take a look – Rage Jun 19 '20 at 21:34
0

I don't see the utility of the ngram stuff if you will be receiving the first few letters of the string.

At that point, this is optimal:

SELECT ... WHERE name LIKE 'Al%'
    LIMIT 10;

with INDEX(name).

If you need to use a UNION ALL together with LIMIT, then do this:

( SELECT ... ORDER BY .. LIMIT 10 )
UNION ALL
( SELECT ... ORDER BY .. LIMIT 10 )
ORDER BY .. LIMIT 10

If you don't repeated the LIMIT each of the subqueries will gather all the relevant rows, thereby creating a temp table that is larger than necessary.

If you will be using OFFSET (perhaps not for this app), see this for how to make that work: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#or

Rick James
  • 135,179
  • 13
  • 127
  • 222