16

Below is a query I use for searching a person by email

  SELECT *
    FROM phppos_customers
    JOIN phppos_people ON phppos_customers.person_id = phppos_people.person_id
   WHERE deleted = 0
     AND email LIKE '%f%'
ORDER BY email ASC

Will adding an index on "email" speed up the query?

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
Chris Muench
  • 17,444
  • 70
  • 209
  • 362
  • 1
    [That explaination](http://use-the-index-luke.com/sql/where-clause/searching-for-ranges/like-performance-tuning) might help to understand why it's not working. – Markus Winand May 12 '11 at 05:42
  • 1
    [**This answer**](http://stackoverflow.com/a/22531268/793309) shows a nice technique - indexing all suffixes - that can make this kind of query perform very well, but at the cost of some extra coding and greater storage requirements. – antinome Dec 11 '14 at 15:26
  • Yes, this is possible; you can use an n-gram full-text index: https://stackoverflow.com/a/74956857/2299084 – flaviut Feb 24 '23 at 16:59

6 Answers6

31

No, because MySQL will not be able to utilize the index when you have a leading wildcard. If you changed your LIKE to 'f%', then it would be able to use the index.

Brad
  • 5,428
  • 1
  • 33
  • 56
10

Wildcarding the left side of a LIKE operation ensures that an index, if one exists on the email column, can not be used.

Full Text Search (FTS) is preferred syntax for finding strings within text via SQL. MySQL has native FTS functionality, using the MATCH/AGAINST syntax (Requires the table to use the MyISAM engine for v.5.5 and below. InnoDB FTS supported on v.5.6+):

  SELECT c.*, p.*
    FROM PHPPOS_CUSTOMERS c
    JOIN PHPPOS_PEOPLE p ON p.person_id = c..person_id
   WHERE deleted = 0
     AND MATCH(email) AGAINST('f')
ORDER BY email 

But there are third party FTS technology, such as Sphinx.

Joshua Huber
  • 3,443
  • 20
  • 27
OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
  • I detailed the FULLTEXT and talked a bit about Sphynx here : http://stackoverflow.com/questions/3338889/how-to-find-similar-results-and-sort-by-similarity/3339034#3339034 – Yanick Rochon May 05 '11 at 23:43
  • As of MySQL 5.6 FTS functionality is now available on InnoDB tables. – blo0p3r Apr 07 '15 at 12:56
  • 2
    This is pretty useless, since FTS adds full words to index, whilst wildcard %% allows you to do partial search. There is ngram type of FTS index for MySQL, but it's quite useless to whilst trying to replicate: %% behaviour, since it doesn't support min/max ngram size (as e.g. ElasticSearch do), just the one `ngram_token_size`. – Tom Raganowicz Oct 08 '18 at 11:17
  • There is a way to shoehorn FTS into performing partial matches on both sides: https://stackoverflow.com/a/74956857/2299084 – flaviut Dec 29 '22 at 23:19
8

No, Mysql will not use the index because LIKE argument (%f%) starts with the wildcard character %. If it starts with a constant, index will be used.

More info: 7.5.3. How MySQL Uses Indexes

manji
  • 47,442
  • 5
  • 96
  • 103
4

In my post here I describe, in detail, a technique that allows you to use the index with LIKE for fast %infix% search, at the cost of some extra storage:

https://stackoverflow.com/a/22531268/543814

As long as the strings are relatively small, the storage requirement is generally acceptable.

According to Google, the average e-mail address is 25 characters long. This increases your required storage by a factor 12.5 on average, and gives you fast indexed search in return. (See my post for the calculations.)

From my perspective, if you are storing 10'000 e-mail addresses, you should be fine storing (the equivalent of) about 100'000 e-mail addresses, too. If this is what it takes to allow you to use an index, that seems like an acceptable trade-off. Often, disk space is cheap, while non-indexed searches are unaffordable.

If you choose to take this approach, I suggest that you limit the input length of e-mail addresses to 64 characters. Those rare (or attacker) e-mail addresses of such length will require up to 32 times the usual storage. This gives you:

  1. Protection against an attacker trying to flood your database, since these still aren't very impressive amounts of data.
  2. The expectation that most e-mail addresses are not of this length anyway.

If you consider 64 characters too harsh a requirement, use 255 instead, for a worst-case storage increase factor of 127.5. Ridiculous? Possibly. Likely? No. Fast? Very.

Community
  • 1
  • 1
Timo
  • 7,992
  • 4
  • 49
  • 67
2

You will not be able to make it faster with LIKE just like everyone says (about the % at the beginning), but you can improve it a little by joining after you filter your people first.

SELECT *
  FROM (SELECT * 
          FROM `phppos_customers`
         WHERE `deleted` = 0
           AND  `email`  LIKE '%f%') `t_customers`
  JOIN `phppos_people` ON `t_customers`.`person_id`=`phppos_people`.`person_id`
 ORDER BY `email` asc
Yanick Rochon
  • 51,409
  • 25
  • 133
  • 214
  • Using the `LIKE` with a left side wildcard in a derived table/inline view still won't use an index... – OMG Ponies May 05 '11 at 23:48
  • The OP asks specifically about index use... I'm inclined to downvote, being that you're aware you don't provide anything of value to the question... – OMG Ponies May 05 '11 at 23:55
  • 1
    I clearly answered "no" because of the `LIKE` (edited and clarified for your enjoyment), and provided an "improvement" on the query. Now, it may not answer the original question, but does provide helpful information about speed gain on a query. Now, what in that deserves a downvote? – Yanick Rochon May 06 '11 at 00:57
  • 3
    Your speculative belief that this is an improvement is not enough - explain plans matter. – OMG Ponies May 06 '11 at 03:33
-3

I know how to outsmart mysql and enable the index search even when wild card searching from the left side. Just create a reversed column of your column (make it an index), reverse also the search string, and use the wildcard from right side which has the index support..

So if you have in db word "slibro" and you want search "%libro", created reversed column will contain "orbils" and search will be "orbil%".

PS: But don't have the solution how to do fast full wildcard search "%x%" though :).

luky
  • 2,263
  • 3
  • 22
  • 40
  • This will not work as the first search will not find any valid results if the word to be found is there inside the string at any place other than the first one. PS: I tried to do this and it failed me miserably. – Ravish Kumar Aug 13 '22 at 07:24