2

I've written a job request system in PHP with a MySQL database and I'm having an issue with a slow query.

My schema (simplified) is as follows:

tbl_job
job_id
job_desc requester_user_id

tbl_user
user_id
user_name

tbl_workermap
workermap_id
job_id
worker_user_id

A table containing the jobs, a user table for the possible workers and one to map workers to jobs. A job can have one or more workers, a worker can have one or more jobs.

tbl_user contains both users who request work, and those that work on the jobs, so user IDs are stored under worker_user_id in tbl_workermap and requester_user_id in tbl_job

When a job is logged it creates an entry in tbl_job but nothing in tbl_workermap until someone specifically assigns a worker. This means that when I query the jobs I do it with a left join as there are not entries in tbl_workermap for every job:

SELECT 
job.job_id,
job.job_desc,
workermap.worker_user_id,
worker.worker_name

FROM tbl_job AS job

LEFT JOIN tbl_workermap AS workermap
ON job.job_id = workermap.job_id

LEFT JOIN tbl_user AS worker
ON workermap.worker_user_id = worker.user_id

The system has been in use for a while and I now have about 8000 entries in tbl_job and 7000 in tbl_workermap and it's taking over 4 seconds to retrieve all results. An EXPLAIN query shows the tbl_workermap join returning around 7000 rows and "Using where; Using join buffer (Block Nested Loop)".

Is there anything I can do to speed this up?

EDIT: add table info
I'd simplified things to explain but here's the actual table structure. There are more joins but tbl_workermap is the only problematic one:

CREATE TABLE `tbl_job` (
  `job_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `job_title` varchar(100) DEFAULT NULL,
  `job_description` text,
  `job_added_datetime` int(11) DEFAULT '0',
  `job_due_datetime` int(11) NOT NULL DEFAULT '0',
  `job_time_estimate` int(11) DEFAULT NULL,
  `job_additional_fields` text,
  `addedby_user_id` int(11) NOT NULL DEFAULT '0',
  `requester_user_id` int(11) NOT NULL DEFAULT '0',
  `worker_user_id` int(11) NOT NULL DEFAULT '0',
  `job_active` tinyint(4) NOT NULL DEFAULT '1',
  `site_id` tinyint(4) NOT NULL DEFAULT '1',
  `status_id` int(11) NOT NULL DEFAULT '1',
  `estimategroup_id` int(11) DEFAULT '1',
  `brand_id` int(11) DEFAULT '1',
  `job_isproject` tinyint(4) DEFAULT NULL,
  PRIMARY KEY (`job_id`),
  FULLTEXT KEY `job_title` (`job_title`,`job_description`,`job_additional_fields`)
) ENGINE=MyISAM AUTO_INCREMENT=8285 DEFAULT CHARSET=latin1



CREATE TABLE `tbl_user` (
  `user_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `user_shortname` varchar(30) DEFAULT NULL,
  `user_name` varchar(30) DEFAULT NULL,
  `user_password` varchar(50) DEFAULT NULL,
  `user_password_reset_uuid` varchar(50) DEFAULT NULL,
  `user_email` varchar(50) DEFAULT NULL,
  `user_description` text,
  `user_sortorder` int(11) NOT NULL DEFAULT '0',
  `user_isworker` tinyint(4) NOT NULL DEFAULT '0',
  `user_active` tinyint(4) NOT NULL DEFAULT '1',
  `site_id` tinyint(4) NOT NULL DEFAULT '0',
  `user_avatar_file_id` int(11) DEFAULT NULL,
  `user_avatar_hub_url` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`user_id`)
) ENGINE=MyISAM AUTO_INCREMENT=917 DEFAULT CHARSET=latin1


CREATE TABLE `tbl_workermap` (
  `workermap_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `job_id` int(11) DEFAULT NULL,
  `workermap_datetime_added` int(11) DEFAULT NULL,
  `workermap_datetime_removed` int(11) DEFAULT NULL,
  `worker_user_id` int(11) DEFAULT NULL,
  `addedby_user_id` int(11) DEFAULT NULL,
  `removedby_user_id` int(11) DEFAULT NULL,
  `site_id` int(11) DEFAULT NULL,
  `workermap_isassigned` int(11) DEFAULT NULL,
  `workermap_active` int(11) NOT NULL DEFAULT '1',
  PRIMARY KEY (`workermap_id`)
) ENGINE=InnoDB AUTO_INCREMENT=7145 DEFAULT CHARSET=latin1

SHOW INDEX

+---------+---+-----------+---+-----------------------+------+------+------+------+-----+----------+--+--+
| tbl_job | 0 |  PRIMARY  | 1 |        job_id         |  A   | 8283 | NULL | NULL |     |  BTREE   |  |  |
+---------+---+-----------+---+-----------------------+------+------+------+------+-----+----------+--+--+
| tbl_job | 1 | job_title | 1 | job_title             | NULL |    1 | NULL | NULL | YES | FULLTEXT |  |  |
| tbl_job | 1 | job_title | 2 | job_description       | NULL |    1 | NULL | NULL | YES | FULLTEXT |  |  |
| tbl_job | 1 | job_title | 3 | job_additional_fields | NULL |    1 | NULL | NULL | YES | FULLTEXT |  |  |
+---------+---+-----------+---+-----------------------+------+------+------+------+-----+----------+--+--+

+----------+---+---------+---+---------+---+-----+------+------+--+-------+--+--+
| tbl_user | 0 | PRIMARY | 1 | user_id | A | 910 | NULL | NULL |  | BTREE |  |  |
+----------+---+---------+---+---------+---+-----+------+------+--+-------+--+--+

+---------------+---+---------+---+--------------+---+------+------+------+--+-------+--+--+
| tbl_workermap | 0 | PRIMARY | 1 | workermap_id | A | 7184 | NULL | NULL |  | BTREE |  |  |
+---------------+---+---------+---+--------------+---+------+------+------+--+-------+--+--+

EXPLAIN query

+---+--------+----------------+--------+---------+---------+------+-------------------------------+------+----------------------------------------------------+
| 1 | SIMPLE |      job       |  ALL   |  NULL   |  NULL   | NULL |             NULL              | 8283 |    Using where; Using temporary; Using filesort    |
+---+--------+----------------+--------+---------+---------+------+-------------------------------+------+----------------------------------------------------+
| 1 | SIMPLE | estimategroup  | eq_ref | PRIMARY | PRIMARY | 4    | jobq.job.estimategroup_id     |    1 | Using where                                        |
| 1 | SIMPLE | brand          | eq_ref | PRIMARY | PRIMARY | 4    | jobq.job.brand_id             |    1 | Using index condition                              |
| 1 | SIMPLE | site           | eq_ref | PRIMARY | PRIMARY | 4    | jobq.job.site_id              |    1 | Using where                                        |
| 1 | SIMPLE | addedby        | eq_ref | PRIMARY | PRIMARY | 4    | jobq.job.addedby_user_id      |    1 | Using index condition                              |
| 1 | SIMPLE | requester      | eq_ref | PRIMARY | PRIMARY | 4    | jobq.job.requester_user_id    |    1 | Using index condition                              |
| 1 | SIMPLE | worker         | eq_ref | PRIMARY | PRIMARY | 4    | jobq.job.worker_user_id       |    1 | Using index condition                              |
| 1 | SIMPLE | status         | ALL    | PRIMARY | NULL    | NULL | NULL                          |    6 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | workermap      | ALL    | NULL    | NULL    | NULL | NULL                          | 7184 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | user_workermap | eq_ref | PRIMARY | PRIMARY | 4    | jobq.workermap.worker_user_id |    1 | Using where                                        |
| 1 | SIMPLE | categorymap    | ALL    | NULL    | NULL    | NULL | NULL                          |    1 | Using where; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | category       | eq_ref | PRIMARY | PRIMARY | 4    | jobq.categorymap.category_id  |    1 | Using where                                        |
+---+--------+----------------+--------+---------+---------+------+-------------------------------+------+----------------------------------------------------+

Adrian
  • 35
  • 4
  • 2
    Please post TEXT results of SHOW CREATE TABLE tbl_job; and tbl_worker; and tbl_workermap; as well as SHOW INDEX FROM tbl_job; and tbl_worker; and tbl_workermap; for analysis. Also, post the EXPLAIN .... results, please. – Wilson Hauck Jan 17 '19 at 19:46
  • 1
    @WilsonHauck I've added that info, thank you. – Adrian Jan 18 '19 at 11:19
  • For the moment, you need two indexes to cover the basic rule of BOTH left and right objects of a JOIN = need an index. 1. ALTER TABLE tbl_workermap ADD INDEX idx_t_w_map_job_id (job_id) 2. ALTER TABLE tbl_workermap ADD INDEX idx_t_w_map_wrk_user_id (worker_user_id) after creation run EXPLAIN ..... to see new execution plan – Wilson Hauck Jan 18 '19 at 13:43
  • 1
    @WilsonHauck Thank you, I've added those two indexes and that's done the trick, it's now near instant. If you can add that info as an answer I can mark it as accepted so you get the credit. – Adrian Jan 21 '19 at 17:17
  • 1
    @WilsonHauck I'd be happy to do that, and have just done so :) Thanks again. – Adrian Jan 23 '19 at 10:41

3 Answers3

1

Jan 18, 2019 at 13:43 For the moment, you need two indexes to cover the basic rule of BOTH left and right objects of a JOIN = need an index. 1. ALTER TABLE tbl_workermap ADD INDEX idx_t_w_map_job_id (job_id) 2. ALTER TABLE tbl_workermap ADD INDEX idx_t_w_map_wrk_user_id (worker_user_id) after creation run EXPLAIN ..... to see new execution plan.

Wilson Hauck
  • 2,094
  • 1
  • 11
  • 19
  • 1
    This solved the issue. I added indexes on tbl_workermap for `job_id` and `worker_user_id` and re-ran the EXPLAIN query. It no longer retrieves 7000+ rows from tbl_workermap and the query is now near instant. Thank you! – Adrian Jan 23 '19 at 10:30
0

If you have not already, create a clustered index by setting a primary key (assuming the tables are properly normalized). (If you haven't, you will likely want to setup foreign key constraints as well.)

If this question were involved Microsoft SQL Server, I would recommend creating a stored procedure, especially if this is a query run frequently as some sort of regular process. As noted in this answer, however, primary performance benefit for simple queries like this will be from table design and indices.

Paul
  • 3,634
  • 1
  • 18
  • 23
  • 1
    Thanks Paul. I am using primary keys, I've edited my post to show how these are set up. – Adrian Jan 18 '19 at 11:21
  • @Adrian, it looks like you have a much more interesting problem than I had thought/hoped. Regarding my stored procedure comment, I was mistaken. If you were using SQL Server, a stored procedure would probably save you some headaches, but it appears that is not the case with MySQL. – Paul Jan 18 '19 at 16:08
0

"A job can have one or more workers." What about vice versa (a worker can be on multiple jobs)? If not, then you have only 1:many, and you should not implement it with that extra table.

Assuming it really is many:many, there here are several tips on optimizing that table.

Don't use LEFT unless you are expecting the "right" table to be missing the desired row.

Style tip: Get rid of tbl_, user_ (except for user_id), etc. That is prefixes on names are clutter and redundant with the context. Be consistent on "user" versus "worker".

Have the many:many table named with both targets (eg, worker_job). However, I now see that it is more than a simple many:many mapping table; it more like a table for assigning and tracking who is working on what over time?

If you need both a history of who worked on what and the current state of who is working on what, consider having two tables. The history keeps growing; the 'current' keeps changing.

Use suitable datatypes, such as DATE and DATETIME.

Use InnoDB instead of MyISAM.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • 1
    Thanks Rick, yes a worker can have multiple jobs so it is many:many. A job may have one, multiple or no workers assigned so there may or may not be an entry in tbl_workermap for any given job. I've edited my original post to clarify. I'll have a read of that link now. – Adrian Jan 18 '19 at 11:23
  • @Adrian - I made several added comments. – Rick James Jan 18 '19 at 20:30
  • 1
    Thanks for the tips, the reason for both `user_id` and `worker_user_id` is that the users table contains both workers and other users. Someone assigning a job may not necessarily be a worker. I've solved the performance issue using WilsonHauck's advice above, but the two tables you suggest sounds like a good solution if this becomes an issue as the database size grows. – Adrian Jan 21 '19 at 17:28