2

I have a simple MySQL table with 40000 rows:

id CHAR(36) (i am using UUIDs as primary keys)
number_from (INT)
number_to (INT)

with 3 indexes:

primary(id)
number_from(number_from)
number_to(number_to)

it looks like simple query like following takes fairly large amount of time (like 30 seconds or more):

SELECT * FROM numbers WHERE (number_from >= 703911711 AND number_to <= 703911800)

it there anything i can do to optimize performance?

Mureinik
  • 297,002
  • 52
  • 306
  • 350
misieg
  • 21
  • 2
  • Please post the output from `EXPLAIN SELECT * FROM.....` 30s on only 40k rows seems quite excessive even if no index was utilized. Also post the output from `SHOW CREATE TABLE numbers` so we can see the full definition in context. – Michael Berkowski Apr 22 '15 at 12:18
  • Don't SELECT * explicitly define field names instead – MuppetGrinder Apr 22 '15 at 12:21

4 Answers4

1

Having the indexes separated will force MySQL to scan them separately. If you index both number_from and number_to together, MySQL could perform the entire where clause on a single index:

CREATE INDEX number_from_number_to_ind
ON numbers (number_from, number_to)

Adding the id to the index will actually allow MySQL to perform the entire query without accessing the table, but at the expense of making the index significantly larger. You'd have to benchmark and see if this actually improves the query's performance or not:

CREATE INDEX all_columns_ind
ON numbers (number_from, number_to, id)
Mureinik
  • 297,002
  • 52
  • 306
  • 350
  • 1
    i dont think there's a need for that last index, at least not with innodb. the primary key is implicitly present in every secondary index – pala_ Apr 22 '15 at 12:51
0

You have two problems that are not easily solved.

  • UUIDs are terribly 'random',
  • The optimizer dos not understand that the from-to ranges do not overlap

But first, let's make sure you have the cache settings right. If you are using MyISAM, key_buffer_size should be about 20% of available RAM. Or for InnoDB, innodb_buffer_pool_size should be about 70% of available RAM. If you had small settings, this change would only move the problem from being I/O to being CPU. And it will not scale past when you cannot cache most of the table and its indexes in RAM. Reference

My blog explains why UUIDs are terrible for large indexes. That also provides a Stored Function to make the UUID smaller (BINARY(16)), hence more cacheable; this will help with the I/O.

If you have "type-1" UUIDs (such as generated by MySQL), the blog also explains why, with the help of the Function, use of UUIDs can be scaled. But if it is some other flavor of UUID, and you must have at least one BTree (INDEX or PRIMARY KEY) with the UUID in it, your task is not scalable. Can you turn it into an AUTO_INCREMENT? That would allow INSERTs to be clumped at the 'end' of the table rather than randomly scattered around the table, thereby blowing out cache?

As for the ranges, you need to revamp the table to put only one of from/to in each row. This involves extra rows for the 'unowned' ranges. Then write a stored routine to use LIMIT 1 to prevent what you are seeing -- namely failure to effectively use any index. Another blog goes into the details, together with schema design and reference code for IPv4 (which might be what you are doing) and IPv6. My code is very scalable and eliminates the indexing problem you posed.

Rick James
  • 135,179
  • 13
  • 127
  • 222
0

Conventional indexes don't work when the bounds of your range are stored in different attributes. As lots of people have discovered and documented, the solution is to use geospatial indexing. This was the first hit in google.

Since mysql geospatial indexing works in 2 dimensions, the trick is to map the range to one dimension and use a fixed scale in the other dimension.

symcbean
  • 47,736
  • 6
  • 59
  • 94
  • I was inspired by your answer and the linked blog post, so I tried to apply it to datetime columns. However, I am facing this [problem](http://stackoverflow.com/questions/43757499/mariadb-myisam-spatial-index-not-being-used) – architectonic May 03 '17 at 10:31
0

Old post, I know, but I like to keep StackOverflow a bit up-to-date if possible.
So, if you need a RANGE(from,to) kind of function, please note that nowadays MariaDB has an ENGINE called SEQUENCE that will do this for you:

SELECT `seq` FROM `seq_0_to_100`

as long as the from and to numbers are within the limits of BIGINT(20) UNSIGNED

Nibbik
  • 11
  • 2