3

If I execute this query:

SELECT * FROM table1 WHERE name LIKE '%girl%'

It returns all records where name contains 'girl'. However, because of the first wildcard % in the LIKE statment, it cannot (or does not) use indexes as stated here: Mysql Improve Search Performance with wildcards (%%)

Then I changed the query to:

SELECT * FROM table1 WHERE name LIKE 'girl%' OR name LIKE '%girl%'

On the leftside of the OR I removed the wildcard so it can use indexes. But the performance win depends on how MySQL evaluates the query.

Hence my question: Does the performance of my query increases when I add the OR statement?

Community
  • 1
  • 1

2 Answers2

3

No, the performance will be the same. MySQL still has to evaluate the first condition (LIKE '%girl%') because of the OR. Then it can evaluate the second condition using index. You can see this info when you EXPLAIN your query (mysql will show that it stills needs to do a full table scan, which means check each row):

EXPLAIN SELECT * FROM table1 WHERE name LIKE 'girl%' OR name LIKE '%girl%'

For better performance for these kinds of queries you would need to use Fulltext indexes and special syntax for querying them. But FT indexes behave different and are not suited for everything.

Marki555
  • 6,434
  • 3
  • 37
  • 59
  • EXPLAIN shows that adding or removing the - name LIKE 'girl%' - does not decrease the amount of rows. My conclusion is that @Marki555 is correct. – Sebastien Hoek Jul 14 '15 at 09:27
  • Just as an interesting point: searching for just `girl%` is on average 10x faster than `%girl` in my test cases. If you know which keywords are important to you(fixed keywords) you could make an index table that links prime searchwords to entries to make searching for prime keywords faster. – Tschallacka Jul 14 '15 at 09:31
  • 10x faster depends on the table size. With larger tables you will have 20x, 100x: you are comparing an index scan vs a full table scan – cristian v Jul 14 '15 at 09:48
1

(This answer provides a summary of the comments, plus contradicts some of the previous notes.)

Leading wildcard:

SELECT * FROM table1 WHERE name LIKE 'girl%' OR name LIKE '%girl%'
SELECT * FROM table1 WHERE                      name LIKE '%girl%'

Either of those will do a table scan and ignore any indexes. This both because of the leading wild card and the OR. (It will not use the index for 'girl%', contrary to what @Marki555 says -- it's not worth the extra effort.)

Range query via LIKE (no leading wildcard):

SELECT * FROM table1 WHERE name LIKE 'girl%'

will probably use INDEX(name) in the following way:

  1. Drill down the BTree for that index to the first name starting with "girl";
  2. Scan forward (in the index) until the last row starting with "girl";
  3. For each item in step 2, reach over into the data to get *.

Since Step 3 can be costly, the optimizer first estimates how many rows will need to be touched in Step 2. If more than 20% (approx) of the table, it will revert to a table scan. (Hence, my use of "probably".)

"Covering index":

SELECT name FROM table1 WHERE name LIKE '%girl%'

This will always use INDEX(name). That is because the index "covers". That is, all the columns in the SELECT are found in the INDEX. Since an INDEX looks and feels like a table, scanning the index is the best way to do the query. Since an index is usually smaller than the table, an index scan is usually faster than a table scan.

Here's a less obvious "covering index", but it applies only to InnoDB:

PRIMARY KEY(id)
INDEX(name)
SELECT id FROM table1 WHERE name LIKE '%girl%'

Every secondary key (name) in InnoDB implicitly includes the PK (id). Hence the index looks like (name, id). Hence all the columns in the SELECT are in the index. Hence it is a "covering index". Hence it will use the index and do an "index scan".

A "covering index" is indicated by Using index showing up in the EXPLAIN SELECT ....

Rick James
  • 135,179
  • 13
  • 127
  • 222