4

I have a query that executes less time on dev server than on prod (database is the same). Prod server is much more efficient (64gb ram, 12 cores, etc).

Here's the query:

SELECT `u`.`id`,
       `u`.`user_login`,
       `u`.`last_name`,
       `u`.`first_name`,
       `r`.`referrals`,
       `pr`.`worker`,
       `rep`.`repurchase`
FROM `ci_users` `u`
LEFT JOIN
  (SELECT `referrer_id`,
          COUNT(user_id) referrals
   FROM ci_referrers
   GROUP BY referrer_id) AS `r` ON `r`.`referrer_id` = `u`.`id`
LEFT JOIN
  (SELECT `user_id`,
          `expire`,
          SUM(`quantity`) worker
   FROM ci_product_11111111111111111
   GROUP BY `user_id`) AS `pr` ON `pr`.`user_id` = `u`.`id`
AND (`pr`.`expire` > '2015-12-10 09:23:45'
     OR `pr`.`expire` IS NULL)
LEFT JOIN `ci_settings` `rep` ON `u`.`id` = `rep`.`id`
ORDER BY `id` ASC LIMIT 100,
                        150;

Having following explain result on dev server:

   +----+-------------+------------------------------+--------+---------------+-------------+---------+-----------+-------+---------------------------------+
| id | select_type | table                        | type   | possible_keys | key         | key_len | ref       | rows  | Extra                           |
+----+-------------+------------------------------+--------+---------------+-------------+---------+-----------+-------+---------------------------------+
|  1 | PRIMARY     | u                            | index  | NULL          | PRIMARY     | 4       | NULL      |     1 | NULL                            |
|  1 | PRIMARY     | <derived2>                   | ref    | <auto_key0>   | <auto_key0> | 5       | dev1.u.id |    10 | NULL                            |
|  1 | PRIMARY     | <derived3>                   | ref    | <auto_key1>   | <auto_key1> | 5       | dev1.u.id |    15 | Using where                     |
|  1 | PRIMARY     | rep                          | eq_ref | PRIMARY       | PRIMARY     | 4       | dev1.u.id |     1 | NULL                            |
|  3 | DERIVED     | ci_product_11111111111111111 | ALL    | NULL          | NULL        | NULL    | NULL      | 30296 | Using temporary; Using filesort |
|  2 | DERIVED     | ci_referrers                 | ALL    | NULL          | NULL        | NULL    | NULL      | 11503 | Using temporary; Using filesort |
+----+-------------+------------------------------+--------+---------------+-------------+---------+-----------+-------+---------------------------------+

And this one from prod:

+----+-------------+------------------------------+--------+---------------+---------+---------+--------------+-------+---------------------------------+
| id | select_type | table                        | type   | possible_keys | key     | key_len | ref          | rows  | Extra                           |
+----+-------------+------------------------------+--------+---------------+---------+---------+--------------+-------+---------------------------------+
|  1 | PRIMARY     | u                            | ALL    | NULL          | NULL    | NULL    | NULL         | 10990 |                                 |
|  1 | PRIMARY     | <derived2>                   | ALL    | NULL          | NULL    | NULL    | NULL         |  2628 |                                 |
|  1 | PRIMARY     | <derived3>                   | ALL    | NULL          | NULL    | NULL    | NULL         |  8830 |                                 |
|  1 | PRIMARY     | rep                          | eq_ref | PRIMARY       | PRIMARY | 4       | prod123.u.id |     1 |                                 |
|  3 | DERIVED     | ci_product_11111111111111111 | ALL    | NULL          | NULL    | NULL    | NULL         | 28427 | Using temporary; Using filesort |
|  2 | DERIVED     | ci_referrers                 | ALL    | NULL          | NULL    | NULL    | NULL         | 11837 | Using temporary; Using filesort |
+----+-------------+------------------------------+--------+---------------+---------+---------+--------------+-------+---------------------------------+

Profiling results on prod server shown me something like that:

............................................
| statistics                     | 0.000030 |
| preparing                      | 0.000026 |
| Creating tmp table             | 0.000037 |
| executing                      | 0.000008 |
| Copying to tmp table           | 5.170296 |
| Sorting result                 | 0.001223 |
| Sending data                   | 0.000133 |
| Waiting for query cache lock   | 0.000005 |
............................................

After googling a while I decided to move temporary tables into RAM:

/etc/fstab:

tmpfs /var/tmpfs tmpfs rw,uid=110,gid=115,size=16G,nr_inodes=10k,mode=0700 0 0

directory rules:

drwxrwxrwt  2 mysql mysql   40 Dec 15 13:57 tmpfs

/etc/mysql/my.cnf(played a lot with values):

[client]
port        = 3306
socket      = /var/run/mysqld/mysqld.sock

[mysqld_safe]
socket      = /var/run/mysqld/mysqld.sock
nice        = 0

[mysqld]
user        = mysql
pid-file    = /var/run/mysqld/mysqld.pid
socket      = /var/run/mysqld/mysqld.sock
port        = 3306
basedir     = /usr
datadir     = /var/lib/mysql
tmpdir      = /var/tmpfs
lc-messages-dir = /usr/share/mysql
skip-external-locking
bind-address        = 127.0.0.1
key_buffer      = 16000M
max_allowed_packet  = 16M
thread_stack        = 192K
thread_cache_size       = 150
myisam-recover         = BACKUP
tmp_table_size         = 512M
max_heap_table_size    = 1024M
max_connections        = 100000
table_cache            = 1024
innodb_thread_concurrency = 0
innodb_read_io_threads = 64
innodb_write_io_threads = 64
query_cache_limit   = 1000M
query_cache_size        = 10000M
log_error = /var/log/mysql/error.log
expire_logs_days    = 10
max_binlog_size         = 100M

[mysqldump]
quick
quote-names
max_allowed_packet  = 16M

[mysql]

[isamchk]
key_buffer      = 16M

And it doesn't work. Execution time still the same, around 5 sec. Can you please answer 2 questions:

  1. What's wrong with tmpfs configuration?
  2. Why explains are different on servers, how can I optimize this query? (even not using tmpfs; I figured out that if the last 'order by' removed, query completes much faster).

Thanks in advance.

Satevg
  • 1,601
  • 2
  • 16
  • 23
  • 1
    well, yeah, removing `order by` invariably does help on "large" queries. to sort the results, you have to build the entire result set first. taking away the ordering removes that need - you can start spitting out results as they're found. – Marc B Dec 16 '15 at 14:04
  • 1
    Explains show that on prod your query does not use indexes. Scanned row numbers are significantly higher on prod as a result. Have you compared the indexes on the underlying tables (esspecially on ci_users table) to confirm if they are the same? Did you try to use index hints on prod server if the inexes are the same to force the use of primary key on ci_users? – Shadow Dec 16 '15 at 14:08
  • @Shadow Results for "show index for table_name" are the same for all tables except 1 column: Cardinality column on prod server changes every time I run command! For ci_users table this value between 10,5k-11,5k. What can this mean? Can you please say more about indexes hints? (maybe as separate answer) – Satevg Dec 16 '15 at 14:36
  • 1
    mysql index hints: http://dev.mysql.com/doc/refman/5.7/en/index-hints.html - I would start with a force index primary on ci_users. – Shadow Dec 16 '15 at 14:50
  • What are the mysql versions of prod and dev servers? – Shadow Dec 16 '15 at 14:55
  • @Shadow Dev: debian 7, Mysql 5.6.28. Prod: debian 8, Mysql 5.5.44 I'm trying to apply index hints like - FROM `ci_users` force index (PRIMARY) - but getting syntax error all time (I'm slowpoke, will continue with that...) – Satevg Dec 16 '15 at 16:10
  • 1
    This subtle difference (v5.5 vs v5.6) may explain the speed difference, since mysql enhanced the subquery performance in v5.6. It can create indexes on subqueries from v5.6. You can see those indexes as in the explain from dev. In v5.5 this functionality does not exist. See http://mysqlserverteam.com/better-performance-for-joins-not-using-indexes/ for details – Shadow Dec 16 '15 at 16:14
  • @Shadow thanks! changing mysql version helped! Can you please create an answer, the same as a comment? I'll remove UPD* from question and accept your advice – Satevg Dec 17 '15 at 15:10

2 Answers2

2

Why explains are different on servers, how can I optimize this query? (even not using tmpfs; I figured out that if the last 'order by' removed, query completes much faster).

You say "database is the same", but from the explain outputs you presumably mean "the schema is the same". It looks like there is a lot more data in the production schema? MySQL optimises the way it handles queries based on the amount of data, index sizes, etc. That'll explain (at the highest level) why you're seeing such dramatic differences.

The column of your explain outputs to look at is "rows". Notice how the two derived tables were very small in dev? It looks like (you could ask in #mysql on freenode IRC to confirm) that MySQL was creating indexes for the derived tables in dev, but choosing not to in production (possibly because there were so many more records?).

What's wrong with tmpfs configuration?

Nothing. :) MySQL creates temporary tables in memory until the amount of data in them hits a certain size (tmp_table_size) before it writes temporary data to disk. You can trust MySQL to do this - you don't need to create all the complexity and overhead of creating a temporary filesystem in memory and pointing MySQL there... The key variable for InnoDB is innodb_buffer_pool_size, which I can't see you've tuned.

There's plenty of documentation online, including a lot of (IMHO) good stuff by Percona. (I'm not affiliated with them, but I have worked with them; if you can afford a support contract with them - do it. They really know their stuff.)

I'm absolutely no expert in tuning MySQL, so I'm not going to comment on the options you've selected, except to say that I've spent weeks before reading and tuning - just to have the Percona team look at it and say "That's great, but you've missed this and got that wrong" - and had a noticeable improvement as a result!

Finally I'd point at some other things - indexes, schema and queries being the major ones. You've got two subqueries, I'd try to factor those out to see if that helps first. You'll need a representative data sample available in dev to tune the query properly. (I've used a read-only replication server for this in the past.) I'm not fully understanding what your query is trying to do but it looks like you can just join those tables in and group the overall result.

If I'm missing the obvious (likely!) - then I'd consider maintaining a table of the data in those subqueries separately. I've always used SPs to handle INSERTs by default since a DBA pointed out you can more easily add such cache logic in at a later time in a transactionally safe manner. So when you insert into ci_* tables, also update a table of the COUNT() data (if you can't factor out the subqueries) - so everything becomes a well-indexed set of joins.

wally
  • 3,492
  • 25
  • 31
  • Much thanks for reply. No, data and schema is the same on these databases(restored from backup on prod. It's prod VPS in future, not in live right now as problem still not solved), around 11k users... Weird thing is that weaker server handles faster with the same dataset. Thank you again for the second part of answer. I will dive into these advices – Satevg Dec 16 '15 at 16:20
1

The explains show that on prod the query does not use indexes on u, derived1, derived2 tables, while on dev it does. Scanned row numbers are significantly higher on prod as a result. The index names on the 2 derived tables suggest that these have been created by mysql on the fly, taking advantage of materialized derived tables optimisation strategy, which is available from mysql v5.6.5. Since no such optimization is present in the explain from the prod server, prod server may have an earlier mysql version.

As @Satevg supplied in a comment, the dev and prod environments have the following mysql versions:

Dev: debian 7, Mysql 5.6.28. Prod: debian 8, Mysql 5.5.44

This subtle difference in mysql version may explain the speed difference, since the dev server can take advantage of the materialization optimization strategy, while the prod - being v5.5 only - cannot.

Shadow
  • 33,525
  • 10
  • 51
  • 64