Increase performance of a one-to-many join

Question

We're building a model where we're joining a 13-part profile (01_01_resource_utilization_prepared) to a daily record to create 13 record per day; this is a deliberate one-to-many which grows the size of the table.

It is a simple query but we have tried indexing but what is the best way to optimise this query?

SELECT
a.DATE,
a.RUN_ID,
a.HOURS,
a.HOURS * b.RESOURCE_DISTRIBUTION,
a.SCHEDULE_PROFILE_ID,
a.WEEKDAY_NUMBER,
a.SCHEDULE_DISTRIBUTION,
b.RESOURCE_DISTRIBUTION,
a.LOCATION_DESC,
a.DEPARTMENT_DESC,
a.LANGUAGE_DESC,
a.JOB_TITLE_DESC,

FROM
03_01_schedule a
LEFT JOIN 01_01_resource_utilization_prepared b ON (
    a.RESOURCE_PROFILE_ID = b.RESOURCE_PROFILE_ID
    AND a.DATE >= b.EFFECTIVE_FROM
    AND a.DATE <= b.EFFECTIVE_TO
)

Hi, and welcome to StackOverflow. There's a few important pieces of information we need to help. The schemas of the tables involved, please [edit your question](https://stackoverflow.com/posts/50750735/edit) with the output of [`show create table`](https://dev.mysql.com/doc/refman/5.7/en/show-create-table.html) for each table in the query. This will show us the types and what is indexed, a key to performance. It will also help to see the [`explain` for that statement](https://dev.mysql.com/doc/refman/5.7/en/using-explain.html). Finally, show us a sample of your input and the output you want. — Schwern, Jun 07 '18 at 23:09

score 1 · Answer 1 · answered Jun 08 '18 at 03:00

Does 01_01 refer to Jan 01? If so, I suggest that is a bad way to lay out the data. But meanwhile...

Checking for within a range, where the range comes from another table is hard to optimize. These composite indexes on b will help a little:

INDEX(RESOURCE_PROFILE_ID, EFFECTIVE_FROM)
INDEX(RESOURCE_PROFILE_ID, EFFECTIVE_TO)

Is LEFT needed? If it can be removed without destroying the semantics, then a much better option avails itself. Removing LEFT would let this be useful on a:

INDEX(RESOURCE_PROFILE_ID, `DATE`)

(Meanwhile, I did not understand the relevance of anything you said in your first paragraph.)

Schwern · Answer 2 · 2018-06-08T08:01:00.033

0

Without more information I can't say exactly, but performance will depend on indexing the columns you're comparing. Without indexes the join may have to scan every row, a "full table scan".

It's pretty common in MySQL to forget to declare foreign keys. 03_01_schedule.RESOURCE_PROFILE_ID and 01_01_resource_utilization_prepared.RESOURCE_PROFILE_ID should be declared as foreign keys and they will be indexed. This will make the basic join much faster and also supply referential integrity.

03_01_schedule.DATE, 01_01_resource_utilization_prepared.EFFECTIVE_FROM, and 01_01_resource_utilization_prepared.EFFECTIVE_TO should all be indexed. This will make comparisons using those columns much faster.

edited Jun 08 '18 at 08:01

answered Jun 07 '18 at 23:15

Schwern

153,029
25
195
336

A FK implicitly creates an `INDEX` if one does not already exist. Because an FK implies a constraint to check, it actually slows down the code a little (even assuming that the index exists). – Rick James Jun 08 '18 at 02:58
`INDEX(a), INDEX(b)` is _not_ the same as `INDEX(a,b)`. In many cases (including this one) the composite key is _better_. – Rick James Jun 08 '18 at 02:59
@RickJames Yes, I worded that bit about FKs and indexes poorly. I'll fix that. As for the specific indexes, without more information it's hard to know. – Schwern Jun 08 '18 at 08:00

Increase performance of a one-to-many join

2 Answers2